Thursday, 14 April 2011

VirtualBox: "FATAL: Could not read from the boot medium! System halted."

I've been configuring a test cluster on VirtualBox using Puppet and Cobbler. The goal is to have an internal network inside VirtualBox, with no direct connection from the slaves to the external network. One machine (the "master") runs cobblerd and Puppet, and the other machines connect to it.

Everything went pretty smoothly until I tried to boot my first slave using PXE. Then I got this error message on the console of the new virtual machine:
FATAL: Could not read from the boot medium! System halted.

It seemed that the new virtual machine didn't even try to connect to the DHCP server running on the master. In a PXE boot, the first thing the client does is request an IP address from the DHCP server. But nothing happened.

After two or three hours of testing and checking configuration files, in a moment of desperation, I tried to change the type of the network card from "Intel PRO/1000 MT Desktop" to "PCnet-FAST III (Am79C973)" and everything started to work. Wicked.
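The same adapter change can also be made from the command line with `VBoxManage modifyvm` while the VM is powered off. A minimal sketch in Python that only builds and prints the command; the VM name `slave1` and adapter slot 1 are hypothetical, and `82540EM` / `Am79C973` are the VirtualBox identifiers for the two cards named above.

```python
import subprocess

# VirtualBox identifiers for the two adapters mentioned above.
NIC_TYPES = {
    "Intel PRO/1000 MT Desktop": "82540EM",
    "PCnet-FAST III": "Am79C973",
}


def nictype_command(vm_name, nic_label, slot=1):
    """Build the VBoxManage call that sets the adapter type for one NIC slot."""
    return ["VBoxManage", "modifyvm", vm_name, f"--nictype{slot}", NIC_TYPES[nic_label]]


def apply_nictype(vm_name, nic_label, slot=1):
    """Run the command; needs VirtualBox installed and the VM powered off."""
    subprocess.run(nictype_command(vm_name, nic_label, slot), check=True)


# "slave1" is a hypothetical VM name; this only prints the command.
print(" ".join(nictype_command("slave1", "PCnet-FAST III")))
```

Calling `apply_nictype("slave1", "PCnet-FAST III")` would actually run it.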

Hopefully this will help someone :)

edit:
There seems to be a bug report for this already.

Tuesday, 12 April 2011

33rd degree in Krakow

Overall, this conference was great: lots of good speakers with interesting subjects, and no product demos disguised as lectures. As a bonus, Krakow is a nice city to visit. There were a few problems, though. The first was that everyone spoke Polish, so it was more or less impossible to join discussions during the breaks. It might have helped if there had been an area where English was required. The second was that the sessions were pretty beginner level. That can be a good thing too, but I was already familiar with most of the subjects.

The main themes in the sessions I followed were alternative languages on the JVM, functional programming and system stability. There were a lot of talks about J2EE, but I pretty much skipped all of those. It seemed that polyglot programming is the next big thing, especially when you combine it with scalable systems and massive amounts of data.

If I had to pick the best sessions, I'd say Michael Nygard's were the most interesting, because I'm currently working in the areas he talked about. Venkat Subramaniam was extremely enthusiastic, and his sessions would have been enjoyable even if he had talked about JSP pages; the actual themes made them even better. Jevgeni Kabanov managed to make some deep memory handling stuff understandable by implementing a simple processor in Java.

Lately I've been thinking that a lot of articles and seminar sessions are directed at the wrong audience. I'd say that most developers who read articles and attend seminars already know that writing tests is good, that using multiple languages will boost your productivity, and so on. It is the management who should be preached to about writing tests or using the right tools for the job. The evil management prohibits developers from doing the Right Thing (tm). But during Venkat Subramaniam's and Ted Neward's final talks I realized that when I think that way, I'm actually avoiding responsibility.

I'll definitely try to participate next year, and I'll try to convince a few friends to join me.

Below you'll find long, poorly written summaries of the sessions I attended.


Linda Rising - Deception and estimation


In this session, Linda Rising talked about why estimates tend to be optimistic. The main reason seems to be that we are hardwired to deceive ourselves. We also tend to refuse to think about or process information we do not like, and we can even distort data to fit our own opinions. Of course, we don't see this when we do it.

We tend to overestimate our own abilities. For example, we estimate that we will live ~10 years longer than statistics suggest. But overestimating our abilities might not be all bad: our ancestors were optimistic, and if they had been realistic about their odds of survival, they would not even have tried.

In agile development, estimation is done in small intervals. You might then make a good guess, but the estimates won't get any better, and you'll never get them exactly right. The main thing to remember is that estimates aren't facts.

"There are no facts about the future." David T. Hulett

http://estimategoat.com/


Matt Raible - Comparing JVM Web Frameworks

Matt Raible's JVM web framework comparison is pretty well known; just google for it. The main takeaway from this session was a method for evaluating which framework is best for your current needs. One interesting evaluation point was passion: if someone on your team is passionate about a framework, you should take that into account when making the decision.


Nathaniel T. Schutta - Hacking Your Brain For Fun and Profit


This was about the brain.

Sleep is important (I should know, I'm writing this after arriving home late from the conference). During sleep, the brain is active and processes the events of the day, sorting what is useful from what is not. Naps are also proven to boost performance: one study showed that a 26-minute nap increases performance by 34%. But the main thing is to know your own sleep patterns: know when you are at your best and schedule your time accordingly.

Exercise is another important thing that will improve your performance and, of course, your health. By exercise, Nathaniel didn't mean full-fledged marathon training; even moderate amounts are enough. You could use a standing desk or even a "treadmill desk". One interesting method was the walking meeting: Nathaniel does his one-on-one meetings with his boss while walking around.

The third main point in this session was learning and getting better at what you do. Change is constant, so there's always stuff to learn. Learning happens best when there are elaborate, meaningful stories and examples that include context (war stories). Spaced repetition is also important.

You go through different stages in your development when your understanding evolves: from beginner to expert, from simplistic to complex to profoundly simple. For beginners, rules are important, but they kill experts.

In today's world, there's too much information. This leads to "infotention", which means that you give a little bit of attention to many things. Your attention is precious, so don't waste it. You should start an information diet, meaning that for some time (days, weeks, months) you select what to ignore.

The last thing was writing down your ideas. This is important because you will forget what you've thought of if you don't write it down. And it is meaningful because ideas beget ideas.


Ted Neward - Busy developers guide to Scala: Patterns


Scala is a language that people have called the next Java, which I personally do not agree with. Nevertheless, Scala has a lot of interesting language constructs that remove the need for some patterns and change others, while creating new ones.

When you have first-class functions, a shocking number of patterns go away (chain of responsibility -> list of functions, visitor -> pattern matching).
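To make the first of those concrete, here is a minimal sketch of chain of responsibility as a plain list of functions. It is written in Python rather than Scala, and the approval handlers and thresholds are made up for illustration; nothing here comes from Ted's slides.

```python
from typing import Callable, List, Optional

# A handler returns an answer, or None to pass the request down the chain.
Handler = Callable[[int], Optional[str]]


def small(amount: int) -> Optional[str]:
    return "team lead approves" if amount < 100 else None


def medium(amount: int) -> Optional[str]:
    return "manager approves" if amount < 10_000 else None


def large(amount: int) -> Optional[str]:
    return "board approves"


def handle(chain: List[Handler], amount: int) -> Optional[str]:
    """Try each handler in order and return the first non-None answer."""
    for handler in chain:
        result = handler(amount)
        if result is not None:
            return result
    return None


chain = [small, medium, large]
print(handle(chain, 50))
print(handle(chain, 5_000))
```

The "chain" is just data: reordering or extending it is a list operation, with no handler base class or successor pointers.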


Neal Ford - The Productive Programmer


The last session of day one was a three-hour marathon about programmer productivity. Neal shared a lot of tips, tricks and programs that make everyday work a lot more productive. He also talked about automation and tools. The main thing here is that you should know your tools and use them as much as possible. But even then, remember that tools aren't the product. Don't shave the yaks.


Jevgeni Kabanov - Do you really get memory?


Jevgeni Kabanov is the CTO and founder of ZeroTurnaround. The session had its roots in two blog posts, http://dow.ngra.de/2008/10/27/when-systemcurrenttimemillis-is-too-slow/ and http://dow.ngra.de/2008/10/28/what-do-we-really-know-about-non-blocking-concurrency-in-java/.

Jevgeni showed us a simple processor model written in Java (no working code, though). Along the way he talked about how memory is accessed at the operating system level (and below), what volatile and synchronized mean, and how the heap and garbage collection work.

A few quotes from this session:

"Digging into Java and found some weird stuff there, memory in Java is weirdest abstraction ever."

"It's all about memory: most performance problems are memory related."

"You are always running in a distributed system."

"There is always something exiting in the garbage collection world."


Venkat Subramaniam - State of Scala


This session was mainly about the new features in Scala 2.8, so there's not too much to say about it. Venkat talked about streams, vectors (and tries) etc. The most important feature I hadn't heard of before was the @tailrec annotation. With it, you get a compile-time error if the annotated function is not tail recursive.

Of all the sessions at 33rd Degree, Venkat's were among the top five. He's just so enthusiastic about programming and knows how to capture the audience.


Steve Freeman - Fractal TDD: Using tests to drive system design


The main point in this session was the division between unit testing and system testing. Unit testing makes the system easier to modify; system testing makes it easier to support. A lot of what is needed for system testing is also extremely useful when running in production. For example, end-to-end testing requires that you can:

  • know what the system is doing,
  • know when the system has stopped,
  • know when the system has gone wrong,
  • know why the system has gone wrong,
  • restore the system to good state

All of those are required for automated end-to-end testing.

For the above to succeed, good logging is important. Logs are part of the UI, but decisions about logging are usually made at too low a level. This leads to what Steve calls logorrhea: inconsistent log levels, inconsistent formats, duplicated reports etc. The solution is to move logging into the right domain -> publish monitoring events as structured messages instead of writing log lines. Error reporting, self-healing and alerts can then listen to these events and act accordingly. This leads to observable system behavior, meaning that you observe stuff that is useful instead of stuff that people write when they don't know what else to do.
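A rough sketch of what "monitoring events instead of logging" could look like, in Python. The `Event` and `Monitor` names and the `payment.failed` event are my own illustration, not something Steve showed.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """A structured monitoring event: a name plus key/value data, not a formatted string."""
    name: str
    data: Dict[str, Any] = field(default_factory=dict)


class Monitor:
    """Tiny in-process event bus; listeners subscribe by event name."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, name, listener):
        self._listeners[name].append(listener)

    def publish(self, event):
        for listener in self._listeners[event.name]:
            listener(event)


alerts = []
audit_log = []
monitor = Monitor()
# Alerting and auditing react to the same event without the code that raised it knowing.
monitor.subscribe("payment.failed", lambda e: alerts.append(f"ALERT order={e.data['order_id']}"))
monitor.subscribe("payment.failed", audit_log.append)

monitor.publish(Event("payment.failed", {"order_id": 42, "reason": "timeout"}))
monitor.publish(Event("payment.ok", {"order_id": 43}))  # nobody listens, nothing happens
print(alerts)
```

The application code decides what happened; the listeners decide what to do about it (write a log line, page someone, trigger self-healing).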

One interesting point in this session was that you shouldn't mock third-party integrations. You cannot change the third-party API, so you lose an important part of TDD: the feedback that lets you modify your design. Instead, you should write an adapter and mock that in your unit tests. For the third-party integration itself, you should have separate integration tests that include your adapter.
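As a sketch of the adapter idea in Python: `PaymentGateway` is our own adapter, `Checkout` is the code under test, and the vendor client with its `create_charge` method is entirely hypothetical. The unit test mocks the adapter, never the vendor API.

```python
from unittest.mock import Mock


class PaymentGateway:
    """Our own adapter: the only class that talks to the (hypothetical) vendor client."""

    def __init__(self, vendor_client):
        self._client = vendor_client

    def charge(self, customer_id, cents):
        # Translate the vendor's response shape into our own domain terms.
        response = self._client.create_charge(customer=customer_id, amount=cents)
        return response.get("status") == "succeeded"


class Checkout:
    """Domain code; depends only on the adapter interface we control."""

    def __init__(self, gateway):
        self._gateway = gateway

    def buy(self, customer_id, cents):
        return "confirmed" if self._gateway.charge(customer_id, cents) else "declined"


# Unit test: mock the adapter, whose interface we are free to redesign.
gateway = Mock(spec=PaymentGateway)
gateway.charge.return_value = False
assert Checkout(gateway).buy("c-1", 500) == "declined"
gateway.charge.assert_called_once_with("c-1", 500)
```

A separate integration test would then construct `PaymentGateway` with the real vendor client (or the vendor's sandbox) and exercise the adapter itself.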


Nathaniel Schutta - HTML 5 Fact and Fiction


Pretty standard HTML 5 material about the development of the standard and its features. You can already start using HTML 5 features if you use feature detection (http://www.modernizr.com/).


Venkat Subramaniam - Programming Clojure


About Clojure syntax:
"If you're used to lisp, it's very easy. If you're not used to lisp, get used to it."

Venkat also said "I program in 8 languages, and there wasn't a single language where I didn't cry and kick and scream when learning the syntax", meaning that new syntax will always look ugly.



Steve Freeman - Five years of change, No outages


The example application was a data warehouse for bond state data. The system received updates from different systems and sent them onward after some manipulation. This was the 3rd or 4th attempt to develop this system. The main reasons for its success were the following.


  • Started with a clear team culture -> the culture has held up through 2-3 generations
    • The team had a culture that demanded doing things right. All members had experienced failed projects, so they wanted to get this one right.
    • People were hired for attitude, degree of productivity. An outside researcher spent some time with the team and commented "Other teams talk about quality, you seem to be doing it"
  • First acceptance test two weeks from the beginning
    • The domain made it easy to take one vertical segment at a time. This made it possible to demonstrate the methods used and the progress from early on
  • There was an existing system, so they could use real data
    • Like the previous point, this helped in verifying the system.
  • Own deployment environment, no operations people to say what to use -> made it possible to script deployment -> easy to set up environments -> deployment to production takes 10 minutes
  • Using Fit to show how the system works -> analysts could write new tests and figure out what is happening.
  • The right tests on the right level mean that you don't have to remember everything.
  • A lot of effort went into testing, not in coding tests but in discussing what the system should be doing


Neal Ford - Functional Thinking


"OOP makes working with state easier. FP makes elimination of state easier"

In this session, Neal went through some basic functional programming methods and styles. He also pointed out that you don't need a functional programming language; you can think and code functionally in Java, too.

In the end, he laid out five principles:

  • Immutability instead of state transitions
  • Results over steps
  • Composition over structure
  • Declarative over imperative
  • Paradigm over tool


The main point of this session was that you can use and benefit from functional thinking even if your language doesn't support functions as first-class citizens.
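A small illustration of a few of those principles (immutability, results over steps, composition), sketched in Python with made-up order data. It is not Neal's example, just the same idea in miniature.

```python
from functools import reduce

# Immutable input: a tuple of tuples (customer, amount, paid), never modified in place.
ORDERS = (
    ("ann", 120, True),
    ("bob", 80, False),
    ("ann", 40, True),
)


def paid_total_imperative(orders):
    """Steps: loop, test, mutate an accumulator."""
    total = 0
    for _customer, amount, paid in orders:
        if paid:
            total += amount
    return total


def compose(*functions):
    """Right-to-left composition: compose(f, g, h)(x) == f(g(h(x)))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


def only_paid(orders):
    return (order for order in orders if order[2])


def amounts(orders):
    return (order[1] for order in orders)


# Results: describe what the answer is made of, then compose the pieces.
paid_total = compose(sum, amounts, only_paid)

print(paid_total(ORDERS), paid_total_imperative(ORDERS))
```

Both versions compute the same number; the second one has no mutable accumulator and its parts (`only_paid`, `amounts`) can be reused in other pipelines.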


Patrycja Węgrzynowicz - Automated Bug Hunting


The last session of the second day was about code quality and the tools that can be used to improve it. Despite some technical problems, the presentation went pretty smoothly. There were some interesting techniques and tools, like the use of test oracles and theorem provers.


Michael Nygard - Failure Comes in Flavors


For me, this was probably the most interesting session. Michael had a few good war stories about weird bugs that affected millions of people.

The most important thing to have is a failure-oriented mindset. Every system, every network cable, everything will try to pull your system down. There are a lot of reasons why systems are brought to a halt. If every system were unique, there wouldn't be any hope. Luckily, failures follow patterns.

  • Integration points, out of process calls
    • Every socket, process, pipe or remote procedure call can and will eventually kill your system
    • Timeouts, Circuit breakers
  • Chain reaction
    • Failure in one component raises probability of failure in its peers
    • Common in search engines and application servers
    • Resource leaks are usual
    • Bulkheads -> separate horizontal layer to different pools
  • Cascading failure
    • Layer has been lost, failure moves vertically
    • SOA -> one big failure domain
    • "Damage containment"
    • It's not realistic to eliminate every bug
    • Timeouts, Circuit breakers
  • Users
    • sheer traffic, flash mobs, click-happy
    • malicious users
    • screen scrapers, badly configured proxy servers
  • Attack of Self-Denial
    • Good marketing can kill your system at any time
  • Two types of "bad" users
    • Buyers
      • expensive services -> ssl, integrations, pages
    • bargain hunters, screen scrapers
      • useless sessions
      • divert, throttle or avoid creating sessions
      • especially for spiders
  • Self healing
    • Turn off expensive features
    • Use lightweight landing sites (static)
    • Divert/throttle, good user experience for few users even if you cannot serve everyone
    • Reduce burden of serving each user, watch memory
    • Only allow the user's second click to reach application servers
    • Differentiate people from bots, don't keep sessions for bots
    • Minimize memory
    • Weird things happen
    • Keep lines of communication open
      • support the marketers, they'll do what they want if you say no
      • ie. buy that from 3rd party, integrate it to system -> BOOM!
  • Blocked threads
    • Most common form of crash: all request threads blocked
    • Very difficult to test
    • Permutation of code pathways
    • timing, amount of traffic
    • Keep threads isolated/do not use threads
      • java.util.concurrent
  • Unbalancing capacities
    • Traffic floods sometimes start inside the data center walls
    • Chained systems, where lower has less resources -> might not be issue, depends on traffic and usage
    • Ratios are different in production vs. development
    • Watch out for changes in traffic patterns
    • Funneling of traffic
  • Slow responses
    • Connection refused -> fast failure, thread released
    • Slow response -> thread tied down, user wait
    • On slow response, systems and users try again (timeout -> retry)
    • Causes:
      • Too much load, transient network saturation, firewall overloaded, protocol with built in retries (nfs, dns) hosts file inside own center, use conf management
      • Chatty remote protocols
  • Unbounded result sets
    • Development and QA with small result sets
    • Other systems don't restrict result sets, be careful in SOA
    • Realistic data volumes, copy data from production
    • External systems can change overnight
There is a lot of stuff that can affect system stability. Many of them are things you probably can't even start thinking about.
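Timeouts and circuit breakers came up repeatedly in the list above, so here is a minimal circuit breaker sketch in Python. The thresholds, class names and the injectable clock are my own choices for illustration, not Michael's implementation, and the wrapped call should still carry its own timeout.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling the remote system while the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Fail fast: do not tie up a thread on a system known to be down.
                raise CircuitOpenError("circuit open, failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on too many failures, or re-open if the trial call failed.
            if self._failures >= self.max_failures or self._opened_at is not None:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Usage would be something like `breaker.call(fetch_price, sku)` around every integration point, where `fetch_price` is whatever out-of-process call you are protecting.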


Simon Ritter - The Future of the Java Platform: Java SE 7 and Java SE 8


Pretty basic stuff about new features, mainly in SE 7. A lot of this was familiar from online articles, but it was still a good session, especially the part about why Java SE 7 was delayed.


Matthew McCullough - Hadoop: Divide and Conquer Gigantic Datasets


This was also pretty much basic Hadoop stuff: nothing too fancy, but a good presentation. The interesting parts were the history, structure and usage of Hadoop. Storing all the data instead of regularly summarizing it is quite an interesting idea. When you combine this with sensor networks, you might have some interesting stuff going on.


Neal Ford - Abstraction Distractions


Everything we do is an abstraction over another. Abstraction distraction happens when we think something is real although it is just an abstraction.

Lesson #1
Don't mistake the abstraction for the real thing

Lesson #2
Always understand 1 level below your usual abstraction

Lesson #3
Once internalized, abstractions are hard to get rid of

Lesson #4
Abstractions are both walls & prisons

Lesson #5
Don't name things with underlying details

Lesson #6
Your abstraction isn't perfect

Lesson #7
Understand the implications of rigidity

Lesson #8
Good APIs are not merely high-level or low-level; they're both at once

Lesson #9
Generalize 80% cases; get out of the way for the rest

Lesson #10
Composability, the One true abstraction?



Michael Nygard - Architect for Scale


System scalability is a hot topic today, and everyone seems to be concentrating on it (even if they don't have to). So it was nice to hear someone who has actually operated large scale systems.

Sizes of systems:
Medium: 1 million requests per hour, 100 nodes, no need to talk about scalability, application server centric
Large scale: 10M/hour, 1000 nodes, data centric, automated operations, async messaging, multiple datastores, caching servers, different views of universe and time per server, "Where do I store data?"
Extreme scale: 10B/hour, 10000 nodes, operations centric, "How do I deploy?"

Scalability:
Purely technical definition: Reduction in elapsed processor time due to parallelization of workload
Workload can be divided into two different sections: pure serial section and parallel section.
Contention and coherency
Contention on serial resources
Coherency = state across multiple processes, needs time
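My notes don't name the model behind these terms. The serial/parallel split matches Amdahl's law, and "contention and coherency" matches Neil Gunther's Universal Scalability Law, so the sketch below assumes those two. The coefficients are invented purely to show the shape of the curves.

```python
def amdahl_speedup(nodes, serial_fraction):
    """Amdahl's law: only the parallel section gets faster with more nodes."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


def usl_capacity(nodes, contention, coherency):
    """Universal Scalability Law: relative capacity of N nodes.

    contention: cost of queueing on serial resources
    coherency:  cost of keeping state consistent between every pair of nodes
    """
    return nodes / (1.0 + contention * (nodes - 1) + coherency * nodes * (nodes - 1))


# Made-up coefficients: 5% contention, 0.1% coherency.
for n in (1, 8, 32, 128):
    print(n, round(amdahl_speedup(n, 0.05), 2), round(usl_capacity(n, 0.05, 0.001), 2))
```

With a non-zero coherency term the capacity curve peaks and then falls as nodes are added, whereas Amdahl's law alone only flattens out.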

If you keep adding more nodes, throughput eventually goes down. This is due to coherency: more and more time goes into keeping everything up to date. To keep the number of nodes small enough, big applications must be partitioned. There are two easy partitioning schemes, horizontal and functional. In horizontal partitioning, data is distributed using keys. This is best done in application logic. Functional partitioning means that different functions/transactions are handled on different servers. This can be accomplished on the client side or by using a load balancer.

You can reduce the serial factor too, by using reverse proxies, web accelerators or CDNs. You can also make responses smaller or use caching. But beware: a wrong configuration can actually hurt your application's performance.

One of the most important things in application development is usually forgotten: operations, the people who keep applications running. As applications grow in size, the number of administrators and other support staff rises supra-linearly. You can reduce the number of administrators needed by automating deployment and configuration.
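A minimal sketch of the two partitioning schemes in Python, assuming a simple hash-modulo scheme for the horizontal case. The shard and pool names are hypothetical.

```python
import hashlib

SHARDS = ("db-0", "db-1", "db-2", "db-3")  # hypothetical shard names


def shard_for(key, shards=SHARDS):
    """Horizontal partitioning: pick a shard from a stable hash of the key.

    Uses SHA-256 rather than Python's built-in hash(), which is randomized
    per process for strings and so would not route consistently.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Functional partitioning: route by kind of work instead of by key.
FUNCTION_POOLS = {"search": "search-pool", "checkout": "checkout-pool"}


def pool_for(function_name):
    return FUNCTION_POOLS[function_name]


print(shard_for("customer-42"), pool_for("checkout"))
```

One caveat with plain modulo hashing: changing the number of shards remaps most keys, which is why schemes like consistent hashing exist.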



Venkat Subramaniam - It could be heaven or it could be hell: On being a Polyglot Programmer


Venkat had another great session, this time about being a polyglot programmer, i.e. a programmer who uses multiple languages. The session started with the claim that "what language we use moulds our thoughts". This might lead to suboptimal solutions to problems. But if you know multiple languages, you might be able to see a different way to solve the same problem. It's also possible to use language-"specific" constructs in other languages. So if you only know one type of language, you're at a significant disadvantage.

Java programmers share an unrelenting hope ("I can do that in Java too!"). But there are a lot of things you can do more easily in other languages, for example XML generation in Groovy. Still, the Java platform has a lot of good features, mainly a powerful VM, good libraries and garbage collection. Since 1995, the virtual machine and the libraries have gotten better, but what about the Java language? Luckily, other languages on the JVM can use Java's good features.

The hard part of starting to use multiple languages is convincing others to allow it. Change is hard. One way to do this is simply not to tell them that you're using a different language, and just show the results. But don't be infatuated with technology.

Some people change when they see the light, others when they feel the heat.

Ted Neward - Rethinking "Enterprise"


Teaching follows the same pattern from first grade to industrial courses: first you get a solution (you are taught something) and then you're given a problem to solve. So the problem and the solution are always near each other. This teaches us to always reach for the latest thing we learned.

Resist the Temptation of the familiar! Because every project is different you should reject the "Goal of Reuse".

It's common to try to find the best solution for a problem by asking others or searching for best practices. But the problem is usually so complicated that just to facilitate any answer at all, we have to use such simplified models of the problem that any result is essentially useless. This is one consequence of the fact that every project is unique. So eschew the "best practice". Best practice actually gets you not the best, but merely the average. "Best practices" are our attempts to avoid thinking. We're afraid of being wrong, so we look for answers from others so that we can hide behind their backs.

But there are no shortcuts. You have to do katas, meaning that you have to code small systems using different frameworks, libraries, languages and so on. You have to develop your own evaluation function for every case.

There is no spoon.