I know that a lot of big things have happened in the past week which are worth writing about, including the death of Marvin Zindler, the collapse of the I-35 West bridge in Minneapolis, and Metro's giving everyone a choice on Richmond as to how they are going to get railroaded (but at least they will have some public artwork to look at). Still, I wanted to blog about some notable events which are happening in Nerd World.
1) First, the news came across the wires two weeks ago that checkers has been solved. I should clarify that statement a bit. From a Game Theory standpoint, the game of checkers has been weakly solved, meaning that the outcome from the starting position is now known when neither side makes a mistake: a draw. The program leans on a database of every position with 10 or fewer checkers on the board, and once play gets down to such a position the program cannot lose! It will always obtain at least a draw, if not a win.
Now I can hear a lot of chess players jumping to their feet right now, shouting that chess is ultimately unsolvable. All I can say is that most chess players use programs these days, which says quite a bit about the state of players' attitudes.
2) Neal Krawetz, a security researcher, has written a program which can compare the metadata on image files and determine if a file is an original or whether it has been Photoshopped. The article goes on to describe how most of the Al-Qaeda images we have been seeing in the media have possibly been doctored.
3) Several weeks ago, I noted how Microsoft was developing touch screen computing, which would revolutionize the computing experience. I noted how developers in Penguin land needed to get their act together or M$ would leave them behind. Well, it seems that developers in Penguin land are doing just that with MPX (Multi-Pointer X). The good news is that MPX will recognize multiple users at once.
4) Dell is indeed shaking up Linux land and is seeing some demand for Linux. The good news is that Dell and Google are starting to lean on developers for better drivers.
5) Meanwhile Red Hat, which is the leading Linux distributor, is not sitting still. They are coming out with their own Linux desktop. At the same time, Information Week brings up the hoary old argument of whether having 300 different Linux distros is fragmenting Linux and hurting its adoption. Remember that Unix forked in the 1980's between Sun's Solaris, IBM's AIX, HP's HP-UX, and lots of others, which opened the door to Microsoft winning over the desktop.
6) Last, but probably most interestingly, here is an article with noted Australian (former) kernel developer Con Kolivas, where he talks about his frustrations with kernel development which led him to quit working on Linux.
My own .02 worth is that Linux will continue to be a niche hobby OS until we can develop reliable drivers which always work! Also, the Open Office suite needs to be improved so that it can always open and deal with Adobe and Microsoft Office documents.
A quick story about Linux and drivers. Several weeks ago at the Big Evil Company, one of our processors came in with a USB stick which he wanted to mount on his Linux desktop. My senior counterpart, an incredibly knowledgeable, hard working, and diligent guy, fiddled for some 2 hours trying to get the USB stick to work on that desktop. There's more to the story than issuing a simple mount command, but if Linux is going to make inroads on $200 - $800 copies of Microsoft Vista, then we need to make sure that the user experience is painless and that things work the first time and every time - end of story. Otherwise, the world will continue to pay the steep Microsoft premium and the world will stay a WinTel type place.
Wizard
This year is turning out to be a very interesting one in the world of clusters, grids, and supercomputing. Advances in computing tend to move forward in fits and starts, whereby developments in one area of computing often rush ahead of developments in others. Those areas which have fallen behind then experience breakthroughs which allow them to catch up.
In the past decade, the plummeting costs of processing power and local disk storage, and the development of Ethernet drivers by people like Donald Becker, have allowed for the construction of low cost clusters of computers which can work together to solve problems. That sounds great, but that in and of itself doesn't solve all problems in supercomputing; indeed it creates others. Another major issue in supercomputing is that you often want to have areas of centralized storage, along with the files and filesystems that reside on that storage, which all of the computers in your cluster can read from and write to. Effectively, the computers that are part of your clusters become clients to those pools of storage. There are two general types of such storage, NAS and SAN. And, lastly for this epistle, a new issue has cropped up with the construction of large clusters of compute "nodes", namely that such clusters are so cheap that they can be scaled to the point where they strain the electrical capacity of modern day data centers. Just last year we had to retrofit our data center at VLICA to add 900 amps of electrical power.
The electrical power problem has spurred renewed research into a hoary old idea in computing, namely using graphics cards in computers to help do more general purpose computational work and not just good ol' fashioned things like rendering. In line with what I wrote above about innovations in computing often developing at different rates, graphics cards have improved their capabilities at far faster rates than general purpose CPU's. This has spurred the two giants in the graphics card industry, ATI (which was actually acquired by AMD) and NVidia, to develop solutions to meet this emerging market. Our developers at VLICA have tried several ideas. At this time it looks like NVidia's CUDA suite is the current front runner, but the computing industry is nothing if not competitive and it can't be said that the game is over. Not by a long shot.
The selling point of GPGPU solutions is that, depending upon your applications, they can outperform general purpose CPU solutions by several orders of magnitude. The primary issue still outstanding is cost. The cost of the cards, along with the cost of the hardware (often you have to have 3U sized servers to house the cards), often means that the price per server is also several times that of a cheap Dell or HP server. In other words, as of the current time the increased cost per server cancels out the increased performance of the GPU's, but that is changing. Also, multicore processors are coming on strong, giving GPGPU's a new competitor. The main issue there is that programmers would need to start writing code that allows for multiple threads to execute simultaneously, and coding is hard enough.
The other item of interest is that in the world of Unix and Linux, a 20+ year staple solution for filesystem sharing over networks is getting an overdue overhaul. That system is of course NFS. NFS was written for a world where you had one server serving up files on so called "mount points" defined on the server, and a fairly small number of clients. Originally the mount points were defined in static files, however over the years the number of mount points and clients grew larger and larger. Eventually someone came up with the mechanism of doing mounts automatically via a product called autofs, so that systems administrators would not have to go through the trouble of adding and maintaining an endless number of new mount points in static files on an endlessly increasing number of clients. With autofs, the actual mounting (and unmounting) of filesystems is done on demand, so that users and administrators do not have to issue endless streams of commands to make filesystems available to clients (and release them when they are through with them).
This is all a fine and dandy solution, but running NFS this way creates serious problems. Namely, all the I/O has to go to and from the server which can overload the server's networking capability. The number of mounts being served up can quickly reach into the thousands on a large cluster. Also, NFS has never guaranteed that multiple clients which might be accessing a file (or filesystem) would see the same data on a particular file if multiple clients were attempting reads and writes to the file simultaneously. File locking was always an issue, as was recovery in the event that a server or client failed. There are also security issues which I won't get into.
Several vendors have offered solutions for firms which have large SAN or NAS environments with lots of clients accessing data simultaneously. At VLICA, we have used Polyserve as our solution. What Polyserve offers is their own filesystem which allows for concurrent reads and writes to filesystems in a SAN or NAS from many NFS servers, all of which can act in concert. You install and define Polyserve on NFS servers, which are all aware of each other. They form a "matrix" which is composed of all of the NFS servers defined at installation, as well as all of the NFS filesystems which are meant to be a part of this matrix. You then install a load balancing system between the many clients and the servers in the Polyserve matrix. This distributes I/O amongst the multiple NFS fileservers which have Polyserve installed on them, and from there the servers read and write to the SAN or NAS. Such a solution is known as a clustered filesystem.
However, there are limits to what vendor solutions can do to address the many clients writing to a pool of storage. The single biggest problem is that the performance is not linearly scalable. What I mean by that statement is this: If you have 4 servers which between them can achieve 1GB per second of I/O between the SAN and clients, then 8 servers will have less than 2GB per second of I/O performance. Vendors will claim that their solutions are linearly scalable, but their claims don't stand up under real world conditions. Also, you have the problem of having to install an ever greater number of servers into your NFS server pool. A classic summary of the problems involved in using NFS can be found in the paper entitled "Why NFS sucks", which was written by Olaf Kirch, noted NFS developer, for the 2006 Linux Symposium.
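To put a number on "not linearly scalable", here is a minimal Python sketch. The baseline of 4 servers at 1GB per second comes from the example above. The 1.6GB per second figure for 8 servers is purely hypothetical, made up for illustration, and is not a measurement from our Polyserve matrix. The function name is my own invention.

```python
def scaling_efficiency(base_servers, base_gbps, servers, measured_gbps):
    """Return measured throughput as a fraction of perfect linear scaling."""
    # Perfect linear scaling: throughput grows in proportion to server count.
    ideal_gbps = base_gbps * servers / base_servers
    return measured_gbps / ideal_gbps

# Baseline from the text: 4 servers deliver 1 GB per second.
# Hypothetical measurement (an assumption): 8 servers deliver 1.6 GB per second.
efficiency = scaling_efficiency(4, 1.0, 8, 1.6)
print(f"8 servers reach {efficiency:.0%} of linear scaling")
```

Anything under 100% means you are buying servers faster than you are gaining I/O, which is the vendor claim worth testing on your own floor.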
A more elegant solution to the problem of NFS scalability would be for clients to be able to write directly to the storage pool and in parallel, effectively cutting out the "middle man" of the NFS server. This is exactly what pNFS (parallel NFS), which comes in with a minor revision of NFS version 4 (NFSv4), is designed to do. A highly simplified way of describing how pNFS works is that the server essentially serves up only metadata to clients, telling them how and where to locate the files or filesystems in the storage pool. The I/O itself is not driven through the servers. The clients themselves have a so called "layout driver" in the Linux kernel which takes the metadata served up by the server, and I/O operations then run directly between the client and the storage pool. My supervisor is currently running tests using pNFS on a simple setup and so far we are getting nearly linearly scalable increases in I/O performance in some instances. This stuff does show promise. pNFS will also almost certainly help adoption of NFSv4, since many IT managers have not seen many good reasons, outside of improved security, for adopting NFSv4 in their shops.
Enough for now. This has been a long entry and it is getting late. Another day of problems and black art sorcery awaits tomorrow at VLICA.
Folks, I enjoy working with Linux but Microsoft has been working on something for several years now that has put them back on the front end of the computing innovation curve and will blow the open source movement completely off their rockers. Check out these videos of surface touch screen computing. This is better than anything you might have seen in The Minority Report. Perhaps this is why the Vista operating system took so long to come out?
This is going to be a monster hit. Jeff Han and his Perceptive Pixel company have made sure that computing will never be the same again. The Wizard predicts that in ten years computer keyboards and mice are going to be obsolete and will be filling up space in our dumpsters and landfills.
Ciao for now.
There has been a lot of stuff happening in the IT world recently. This blog entry is a summary of some of the recent goings-on in Nerd World.
1) This article concerning laptop computers and academic achievement was posted in the NY Times via Slashdot. Briefly, school districts across the country are starting to abandon programs which put laptops in schools and give them to students. Some excerpts from the article include:
LIVERPOOL, N.Y. — The students at Liverpool High have used their school-issued laptops to exchange answers on tests, download pornography and hack into local businesses. When the school tightened its network security, a 10th grader not only found a way around it but also posted step-by-step instructions on the Web for others to follow (which they did).
Scores of the leased laptops break down each month, and every other morning, when the entire school has study hall, the network inevitably freezes because of the sheer number of students roaming the Internet instead of getting help from teachers.
snip
“After seven years, there was literally no evidence it had any impact on student achievement — none,” said Mark Lawson, the school board president here in Liverpool.
snip
Yet school officials here and in several other places said laptops had been abused by students, did not fit into lesson plans, and showed little, if any, measurable effect on grades and test scores at a time of increased pressure to meet state standards. Districts have dropped laptop programs after resistance from teachers, logistical and technical problems, and escalating maintenance costs.
Such disappointments are the latest example of how technology is often embraced by philanthropists and political leaders as a quick fix, only to leave teachers flummoxed about how best to integrate the new gadgets into curriculums. Last month, the United States Department of Education released a study showing no difference in academic achievement between students who used educational software programs for math and reading and those who did not.
2) This is an article on a possible upcoming shakeout happening within IBM. Scary as it seems, my company, VLICA, is currently considering an outsourcing agreement with Big Blue.
Addendum - May 7, 2007. Read this counter story - Wizard.
I have watched IBM now for nearly 20 years. Back in the beginning of the 1990's, IBM was under severe attack from the emerging workstation / client - server model of computing. Everyone and anyone in the computing industry, not to mention the chattering classes and punditry, just knew that we were witnessing the fall of an American titan. Fast forward to 2007 and we see that IBM is still here in one piece, even though Compaq, DEC, Tandem, and Wang Laboratories all went under or were merged with others. Other famous names in the industry such as Silicon Graphics have barely managed to slip the bankruptcy noose. Meanwhile Big Blue rumbles on and on and on.
Now none of this is to say that IBM has not made some missteps along the way. IBM was effectively forced out of the commodity desktop market by Dell and others. Still, IBM has its patents, its awesome research facilities, and the one stop services and solutions packages it offers to large organizations, albeit at a price. IBM also managed to bumble through to the realization that Linux was a winner circa 1997 or so. Once the IBM brass got behind it, IBM did do much to legitimize the OS as well as contribute much to kernel development. From all of this, I would say that IBM will survive. It looks as though the company is having to go through one of its periodic shakeouts though.
3) And speaking of companies going through transitions, we found out this week that Dell has elected to support Ubuntu Linux on the desktop. I work with Dell / Redhat Linux on Precision workstations and PowerEdge servers at work.
This decision can be seen through the lens that Dell has been struggling lately vis-à-vis a resurgent HP, now that HP has gotten over the problems that Carly Fiorina had in trying to manage the Compaq / HP merger. Also, choosing Ubuntu allows Dell to avoid having to pay Redhat licensing fees for offering Linux on their desktops and laptops. This is great news for Linux.
4) And finally, there is the Microsoft / Yahoo merger as an idea for countering the Google train. I would both hate and love to see M$ buy Yahoo. The rumored price is something like $50 billion! I would hate to see a merger between the two because a merger would ruin everything that is great about Yahoo - its groups, the news layouts, etc. I love Yahoo.
At the same time, I would love to see M$ buy Yahoo because it would be a huge waste of money on the part of Microsoft. Both companies overlap on many of the same Internet products and services, ergo in my view M$ would simply be throwing lots of money away just to find that they have hardly gained anything.
Until next time.
Wizard
But that isn't the end of my tales on power and cooling in the data center. Now with our proud new cluster in place, we were in business. Things purred along nicely for about six months. Then in June 2006, I was sitting in the data center around noon one day with some summer college interns. The data center manager was on vacation and everyone else was out to lunch - literally. Suddenly I noticed a flickering in the light fixtures. The flickering ran up and down the entire length of the computer room floor. The flickering then stopped, but it repeated about 1 minute later. A feeling of dread came over me. I have worked in data centers for many years and have been through many electrical problems and power outages, but I had never seen anything like that before.
Moments later the alarms went off on the UPS's and the batteries started draining. I fired off an emergency email to all IS systems personnel summoning them to the data center right that moment. That was in fact the biggest problem - convincing your coworkers that there is a problem when they aren't there. My servers? They're still up! What the **** are you worried about?
And so it was. I contacted my boss and he told me to start shutting down our stuff. Meanwhile after about 10 minutes I heard a deep sounding collective POOF that echoed across the data center. I looked on in horror as I saw that all of the air handlers had lost power simultaneously. I re-sent my emails to everyone to get them into the data center to start some kind of orderly shutdown, considering that we were now on an extremely short time frame in which to attain one.
We eventually were able to get things back together. We had blown some fuses on our air handlers and had lost a phase of electrical power. We turned the data center back on after some hours and things ran into the weekend. Then we ran into the same problem over the weekend and again during the next week.
Eventually this led to meetings and pow-wows. The local utility basically said that we were pushing our limits on our allocated line feeds, and we in turn queried the electrical contractor who put the permits in to do the electrical work for our cluster expansions. It seems the contractor had rushed the process and had assured us that the necessary power would be there when in fact there were no such assurances. We eventually had to schedule another shutdown as part of a complex wide shutdown (we rent out of a major office complex which has three buildings in it). Then the power company had to add the needed infrastructure to give us an increase from 1,600 amps to 2,500 amps (at 480 volt, three phase power).
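For anyone who wants to translate those amp figures into capacity, here is a rough Python sketch using the standard three phase formula: apparent power = sqrt(3) * line voltage * line current. It assumes that the 480 volts is the line-to-line voltage and that the amp figures are per-line current, and it ignores power factor and any derating. Treat the results as ballpark numbers and not as what the utility actually provisioned. The function name is my own.

```python
import math

def three_phase_kva(line_volts, line_amps):
    """Apparent power of a balanced three phase feed, in kVA."""
    # Assumption: line_volts is line-to-line voltage, line_amps is per-line current.
    return math.sqrt(3) * line_volts * line_amps / 1000.0

before = three_phase_kva(480, 1600)   # the original service
after = three_phase_kva(480, 2500)    # the upgraded service

print(f"1,600 amps at 480 volts: about {before:,.0f} kVA")
print(f"2,500 amps at 480 volts: about {after:,.0f} kVA")
print(f"Added capacity: about {after - before:,.0f} kVA")
```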
And so it was. What lessons were there to be had in all of this?
1) As your power and cooling demands grow, the potential scope for **** ups will also grow. At first it was only blown power strips at the rack level. Eventually it was overheating the entire data center and overloading the electrical line feeds into the building. You need to do your homework in advance to avoid these problems.
2) Study these issues. Become at least mildly familiar with electrical power and cooling terms, concepts and issues.
Enough for now.
I've worked in the IT business for a long time. One of the hot button issues that has reared its head in recent years is the issue of providing sufficient electrical power and cooling to data centers. As everyone in the business knows, the ongoing drop in hardware prices coupled with the increases in computational power (Moore's Law) have made it possible for companies to amass, at modest prices, computing power which used to cost a mint. This has led to configurations of as many as 80+ servers in a single server rack and to companies having racks and racks of servers.
What amuses me is that even though many in the business foresaw that the trend towards commodity, cluster oriented computing was going to happen and were patting themselves on the back for knowing about Moore's Law, nobody ever seems to have realized that these developments would eventually put immense pressure on those dreary, unsexy, below the radar issues of data center power and cooling until it actually came along. Then, hey, wait a minute! People realized that this was going to cause a problem with powering and cooling data centers. So much for how much people know about the future. Now the federal government has gotten into the act via the EPA.
I decided to write this to share my company's experience with power and cooling over the past 13 years. Briefly, I work as a sysadmin in a reasonably large data center. This data center was designed in 1994 to accommodate an IBM ES9000 mainframe, 20 or so Sun servers, some Compaq servers with Microsoft Windows NT, and a few other sundry servers. We were an early adopter of Linux and started circa 1997-1998 to create relatively small clusters of 128 nodes for seismic data processing. These clusters have gotten bigger and bigger over time.
The first environmental problem we encountered came in our third generation of cluster builds, when we were using Dell 450 workstations stacked 11 or so in a rack. We had 110 volt strips installed on each rack. When we plugged more than 4 workstations into a power strip, that tripped the circuit breaker on the power strip. The data center manager then decided to have the entire data center rewired to 208-230 volt power. We also plugged fewer workstations into each power strip and used 220 volt connections at the rack. This stopped the tripping of the power strips.
The next problem we ran into was about 2 years later. By this time our clusters had grown considerably. We had 500 Dell 1750 1U servers installed at around 38 servers per rack, in addition to about 520 of the above mentioned Dell 450 workstations mounted 11 or so to each rack. Up through this time the original power and cooling configuration in our data center, which was composed of 6 PDU's and seven 20 ton air handlers, was handling the load all right. Then around January 2005 we installed 500 Dell 1855 blade servers at 50 blades per rack and we deinstalled about 100 of the old Dell 450's. That gave us about 1,400 nodes in our clusters. We had some extra 225 kVA PDU's installed in the data center to provide the extra power needed to handle the new electrical load.
We set up the 1750's and the 1855's in a cold aisle enclosure, with walk in doors on either side, where we had floor fans (we have a raised floor) forcing up air into the enclosure and the air would flow out through the servers. We had the servers facing outwards so that we had the hot aisles on the outsides of the enclosures. Due to the fact that we had given up part of the data center for some corporate meeting rooms, we were also now working with the handicap of having to deal with a smaller area of floor space.
It was at this point that the environmentals in our data center started to break down. We discovered that the data center was all right when there was no work going on, or even when we were employing up to about 500 of the nodes. Anything over that and the data center began to really warm up. When we were employing all 1,000 of the 1750's and 1855's, it really got hot! In the early days of this new cluster install, we routinely received high temperature alarms from our servers.
The data center manager, my supervisor and I tried a number of different strategies for cooling the nodes down. We did all the best practices. We sealed off every last crevice in the enclosed area, including using blanking panels on the server racks. We sealed off nearly every open space we could find in floor tile cutouts. We put in porous ceiling tiles to allow the hot air to escape more freely. We put fans on top of the racks to help blow the volumes of hot air out of the racks. This all more or less stabilized the clusters, but the data center got to be so hot that we had to have fans running if we were working in there when the clusters were fully employed. The ambient temperature around the back wall where our clusters were deployed was 96 degrees Fahrenheit when the clusters were running at full power.
Occasional visitors to the data center inevitably complained about how hot it was in there. Weren't we doing something about this? My supervisor was complaining that he wanted to hire an engineering student who knew about thermodynamics and who could tell him how much heat his clusters were generating. He didn't know. He spent some money and sent me to a seminar on cooling data centers that was held here in town. It was a fairly well attended event, but even after talking for half a day, these consultants didn't tell me anything I didn't know already. They certainly didn't tell me how to figure out how much cooling we were going to need to deal with our data center. What a bunch of worthless consultants.
Meanwhile the data center manager was going around telling people that nothing could be done about the matter. He would make statements saying that "this air handler over there can cool all of those Dell 450's". Great. For a long time, I trusted his judgment, but it was becoming clear that I couldn't do that anymore. It was time to start second guessing his judgment.
Finally the news came that broke the camel's back and forced the matter. We received news that another cluster expansion was coming, this time with another 1,000 nodes. The bid went out and Dell won again with 1855 blades. This time I knew that something had to give. The data center manager was a nice guy, but he was getting up there in age and this new world of computing had put us in unexplored country for a long time. For a long time I had heard his pronouncements, but a question finally dawned on me. How did he "know" that the air handler in the corner could handle "all of those Dell 450's"? It was time for me to start doing some research.
Thank the Heavens for the Internet! At the end of the week one Friday night in the summer of 2005, I sat there for hours and hours looking for information. Finally I found some websites which told me what I was looking for. What I found was that 1 watt of electrical power consumed generates 3.41 BTU's of heat per hour. Suddenly the world became clear to me and I knew what to do.
I went around the data center and added up all of the equipment in there, including the Windows servers which were not under our jurisdiction. I did some quick calculations and my jaw dropped. By my math, we were about 60 tons of cooling short in the data center! No wonder it was so hot in there. We needed three full 20 ton air handlers in there right now just to stay even. I wrote an email where I detailed my findings to the relevant parties, doing my very best to be diplomatic towards the data center manager. My writings worked. When we did the install of the 1,000 nodes, the data center manager had 157 tons of new cooling added to our current stable of air handlers. Now we have about 295 tons of cooling in our data center.
Here are the yardsticks:
1) 1 watt of electrical power = 3.41 BTU's of heat per hour. I simply use 3.5 BTU's of heat, just to make the math come out nice. It only adds 3 percent to the actual number and you can use that as a safety pad in the event something goes wrong.
2) 1 ton of cooling = 12,000 BTU's of heat per hour.
So as an example, Dell's 1855 blade servers consume 300 watts of power at full utilization. When 10 1855's are set in a chassis, the chassis and the blades together consume 3,600 watts. Since we were installing 1,000 1855 blades, or 100 chassis, the cooling required would be:
100 chassis * 3,600 watts = 360,000 watts.
360,000 watts * 3.5 BTU's / watt = 1,260,000 BTU's of heat per hour.
1,260,000 BTU's / 12,000 BTU's per ton of cooling = 105 tons of cooling.
So we were going to need 105 tons of cooling to cool 1,000 Dell 1855 blades. Of course the true number is slightly smaller than that (about 102 tons) because the actual conversion is 3.41 BTU's per watt and not 3.5, but you should get the picture.
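If you would rather not do this on the back of an envelope every time, the same arithmetic fits in a few lines of Python. This is only a sketch of the yardsticks above. The constant and function names are mine, and the wattage figures are the ones quoted in this entry, so plug in your own numbers for your own gear.

```python
# Yardsticks from the text (strictly speaking, BTU's per hour).
BTU_PER_WATT_EXACT = 3.41    # actual conversion factor
BTU_PER_WATT_PADDED = 3.5    # rounded up to leave a small safety pad
BTU_PER_TON = 12_000         # one ton of cooling

def cooling_tons(total_watts, btu_per_watt=BTU_PER_WATT_PADDED):
    """Tons of cooling needed to remove the heat from total_watts of load."""
    return total_watts * btu_per_watt / BTU_PER_TON

# 1,000 blades at 10 per chassis is 100 chassis, each drawing 3,600 watts.
chassis_count = 100
watts_per_chassis = 3_600
total_watts = chassis_count * watts_per_chassis

padded = cooling_tons(total_watts)
exact = cooling_tons(total_watts, BTU_PER_WATT_EXACT)

print(f"Total load: {total_watts:,} watts")
print(f"Cooling needed with the 3.5 pad: {padded:.1f} tons")
print(f"Cooling needed at 3.41: {exact:.1f} tons")
```

Walk the floor, add up the nameplate watts for everything in the room, and feed the total to cooling_tons to see how far short (or long) your air handlers are.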
Again, the 1 watt = 3.41 BTU's of heat per hour figure is far and away the most important thing you will ever need to know about cooling a data center! All other issues, whether you are arguing about whether to use a raised floor or not, blanking panels, sealing off floor tile cutouts, and so on, pale into insignificance next to that fact of physics. Of course all other industry best practice cooling issues do matter, but they ultimately matter only on the margin. What you really need to know is that formula above.
Now that I knew this, I knew I was cooking with fire. We installed and powered up our new cluster and guess what? It worked great! We now have about 2,300 servers in our data center, but even when we are running our clusters flat out, the ambient temperature in the hottest parts of the data center near the walls of the cluster gets to be only about 83-84 degrees Fahrenheit. Now some of you IT types reading this might be thinking that this is bananas, that you would never tolerate operating a data center at this temperature. What you need to know is that the rest of the data center is perfectly fine, with ambient temperatures of 70-72 degrees Fahrenheit just yards away from where the hot spots are. People aren't complaining about how hot it is or sweating their butts off anymore when they go in there. I have discovered over the past few years that computer equipment is often a lot more fault tolerant than many people are led to believe or are willing to tolerate (job or career wise) politically. In January 2007, I went to Algeria for my company and literally blew cups full of desert sand out of servers which had stayed up and operational for 2+ years.
Part II is to come shortly.
Ciao for now - Wizard.
At the insistence of my supervisor, I am attending a Linux cluster course at Georgetown University. I will be in Washington D.C. until the end of the week. Here are some first day notes I composed while watching the movie Jennifer Anniston will never watch - aka Mr. and Mrs. Smith - starring former hubby Brad and his new wife Angie Jolie:
The flight to Washington D.C. Went without a hitch. I was stopped by the TSA bureaucrats at Houston Intercontinental because I didn't take my laptop out of my carry on baggage. They lectured me for future reference.
Upon arriving, one notices how close the Reagan airport is to the rest of the city. The Washington D.C. area is absolutely booming, as befits a city in which 22 percent of America's economy flows through. Money fills the air. Office buildings loom everywhere. Georgetown is a very yuppified college town type area. Looking at all these massive bureaucratic temples makes me want to vomit.
I got lost going from the airport to the hotel. A government bureaucrat told me some general directions on how to get to the Georgetown area. Once I got here, I found my bearings on my own and wended my way to the hotel.
I found a Safeway supermarket on the Georgetown campus. Good news since I had to buy various toiletries. Hordes of people from international places and good looking college-aged girls were all making a mad rush to shop since it's the first week of the autumn semester. It all makes me feel old - two of them called me sir.
I already miss Houston.
Sigh... More tomorrow.
This is my inaugural entry on Linux issues into my weblog. Briefly, though I have never written much about my current work on my website or in my weblog, I will say that I work as a systems administrator for a VLICA (Very Large Industrial Corporation of America). A bit more specifically, I administer a very large set of servers which are used for seismic data processing for the prospecting of oil and gas deposits. I also help administer much smaller clusters that are used for reservoir simulation studies and for visualization. Since all the easy oil and gas in the world has been found and dug up, we in the oil and gas industry now have to work for our $60 per barrel oil and $10 per 1,000,000 cubic feet of natural gas.
Recently, we at VLICA added 1,000 computers to our seismic exploration cluster. We have used Red Hat Linux from day one of cluster building. To image such a large number of machines, we use a standard Red Hat technique called "Kickstart", whereby you hook a USB floppy drive and a CD drive to a plain Jane server with nothing on it. You power the machine up and tell the BIOS to look for an OS on the floppy drive first, then on the CD-ROM. The floppy holds the instructions on where to find a web server, which in turn holds the image that is installed on the server. The CD-ROM contains a 1Gb Ethernet driver which aids and abets the process of getting the server on the VLICA network.
Briefly, we were able to image 99% of our servers with no problem. At the very end of our imaging process, I was told by my boss to reimage some older servers which were exactly like the new servers (same model, same manufacturer, using the same network switches which had the exact same network port settings), but which had older versions of Red Hat Linux on them. Update them, my boss said. And so I did.
Or so I tried. I hooked up the kickstart floppies and the 1Gb Ethernet module CD-ROM to the servers and powered them up. The servers would POST (power-on self-test) and start the imaging process. When it came to getting the machines on the network, I would see the network driver being loaded from the CD-ROM, but then I kept getting a strange message on the Linux consoles saying:
pump told us: No DHCP reply received.
As those of you who have ever imaged a Linux machine know, that point in the install process then leads to the prompt where you are expected to manually input the networking information, including the IP address, subnet information and so forth. For those of you who are not familiar with Linux, the kickstart process is a form of automation where the install looks for a file called ks.cfg, which automatically answers the questions that the installer would otherwise have to answer by hand during a manual install.
So what to do? Like many sysadmins, I was puzzled by this sour and unexpected turn of events. I naturally went to Google and punched in the term "pump told us" "no DHCP reply received" and started looking at the various mailing lists where this problem had been encountered before. Here is a partial sampling of such pages:
Here is one mailing list reply which I found repeated at several sites. For the record, I will repeat this fellow's advice as it is fairly good advice:
Could be any number of things. Basically, it's telling you anaconda
can't renew the dhcp lease. Examples of things that might cause it:
- Listing the wrong interface in ks.cfg (try eth1, if you have 2 nics)
- portfast being disabled on your switch (causes STP delays past the
anaconda threshold)
- dhcpd not running on dhcpd/pxe server (unlikely if you already
grabbed the initrd)
- not using the correct driver (what is your nic?)
- not having the correct driver on your ramdisk (try initrd-everything)
Just a few of the things that have bitten me over the years in our
pxe/kickstart/nfs build environment.
--
Jason Dixon, RHCE
DixonGroup Consulting
http://www.dixongroup.net
Now then, the ideas and advice given in the links above are all very good and worthwhile. I would highly recommend investigating what they have to say. However, none of these items worked for me. My networking was just the same as every other server we had imaged, but I went ahead and checked all the network connections anyway - bounced ports, reseated DRAC cards (the servers being imaged were Dell PowerEdge 1855 "blade" servers), and so forth, but again nothing worked.
Finally I went to my boss. He suggested looking at my boot floppies to see if there was anything wrong with them. Briefly, we had ordered such a large number of computers that one idea we had was simply to have the vendor image them for us. Technical issues which I won't get into prevented this idea from being acted on, and in any case we had plenty of experience with imaging large numbers of computers ourselves. So we simply ordered something like 200 boot floppies, and when we were finished imaging one batch of machines, we edited the boot floppies so that they had a new server name on them and reused them. There should have been nothing wrong with the boot floppies I was using.
But there was. It turned out that the boot floppies I was using had apparently been edited with a Windows program such as Wordpad, Notepad, or Textpad. Using these programs instead of the classic Unix / Linux vi editor on a Linux boot floppy added a carriage return character, which vi displays as ^M, to the end of each line of the ks.cfg file, like this:
zerombr yes^M
clearpart --all --initlabel^M
part raid.01 --size=120 --ondisk=sda^M
part raid.03 --size=8000 --ondisk=sda^M
part raid.05 --size=2000 --ondisk=sda^M
part raid.07 --size=2000 --ondisk=sda^M
Those of you who are familiar with ks.cfg "kickstart" files will recognize the code above as being from a kickstart file. Briefly, the code above is supposed to zero out the master boot record and write a new one, clear the partition table, and write new partitions on the first SCSI disk of a server. There are four RAID partitions, with their sizes given in megabytes.
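For those who would rather check a floppy before walking it out to the data center floor, here is a minimal Python sketch of how one might spot the problem. The ^M that vi shows is a carriage return (byte 0x0D), which Windows editors write before each newline. The function name find_cr_lines and the sample bytes are my own illustration, not part of kickstart or any Red Hat tool.

```python
def find_cr_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that end with a carriage return."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if line.endswith(b"\r")  # b"\r" is the byte that vi displays as ^M
    ]


# Hypothetical sample: the first two lines were saved with Windows line endings,
# the third with a plain Unix newline.
sample = (
    b"zerombr yes\r\n"
    b"clearpart --all --initlabel\r\n"
    b"part raid.01 --size=120 --ondisk=sda\n"
)
print(find_cr_lines(sample))
```

An empty list means the file has clean Unix line endings. To check a real file, read it in binary mode (open(path, "rb")) so that Python does not quietly translate the line endings for you.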
It turned out that my boss was correct. These boot floppies were probably edited with a Windows program and had the ^M characters at the end of each line. That threw the kickstart process for a loop when it came time to look for the CD-ROM, load the Ethernet module, and look for a DHCP server to get an IP address so that it could get on the network. Incidentally, if one were to put the boot floppy in a Windows machine, these ^M characters would not show up. They were only viewable when you looked at the boot floppy from one of the Linux virtual consoles (the alt-F2 console if I remember correctly) during the kickstart process. To see these characters in the ks.cfg file, get onto the virtual console while your kickstart process is in progress, mount the floppy drive, change to the directory where the ks.cfg file is located, and then open the ks.cfg file in vi. You should see the ^M at the end of each line. If that happens to you, I would suggest simply getting a fresh boot floppy and creating a new kickstart file with the vi editor. When I did this, voila! My kickstarts worked perfectly.
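If retyping a kickstart file sounds tedious, an alternative to starting from a fresh floppy is to strip the carriage returns from the existing file. This is not what I did at the time; it is just a minimal Python sketch, assuming you have the floppy mounted on a workstation that has Python on it. The names strip_carriage_returns and fix_file are hypothetical helpers of my own.

```python
def strip_carriage_returns(data: bytes) -> bytes:
    """Convert Windows CRLF line endings to Unix LF line endings."""
    return data.replace(b"\r\n", b"\n")


def fix_file(path: str) -> bool:
    """Rewrite the file at `path` in place; return True if anything changed."""
    # Binary mode keeps Python from translating line endings behind our back.
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_carriage_returns(original)
    if cleaned != original:
        with open(path, "wb") as handle:
            handle.write(cleaned)
    return cleaned != original


# In-memory demonstration on one line taken from the listing above.
print(strip_carriage_returns(b"zerombr yes\r\n"))
```

Calling fix_file on the mounted ks.cfg (wherever your floppy happens to be mounted) rewrites it in place. Keep a copy of the original first, and open the result in vi afterwards to confirm the ^M characters are gone.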
Good luck and Regards
TMW