[lfs-support] Configuring GRUB2--Request for a Logic Check

Ken Moffat zarniwhoop at ntlworld.com
Fri Nov 8 18:35:35 PST 2013


On Fri, Nov 08, 2013 at 06:17:44PM -0600, Dan McGhee wrote:
> I have everything in place to boot LFS in EFI-mode.  The problem is that 
> something is failing and the messages go by so fast that I can't see 
> what it is.  Just like I have it set up, the laptop boots to Ubuntu if 
> the LFS boot failed.  My problem is that I'm stymied in my troubleshooting.
> 
> I've reasoned that the problem is probably in my grub.cfg.
> 
[ snip EFI / grub.cfg details, I'm going to query your reasoning ]

 One initial question: are you getting kernel messages when LFS
fails to boot?  If so, in a conventional BIOS boot the showstopper
is usually something like "trying to kill init", and that usually
happens because root= is wrong, or because the drivers for the disk
controller or the root filesystem are not built into the kernel.

 You have root=/dev/sda6, so I will be very surprised if you have
fallen into that hole.

 Hmm, does it switch to booting Ubuntu _immediately_ if the LFS boot
fails?  If so, I can understand why you can't see the messages, but
I've no idea how to address that.

 My reason for questioning if this is a grub.cfg issue is that I've
been reading the kernel mailing list (lkml) since before I ever
found BLFS or LFS.  Some days I skim most of the posts, but in the
last month or two I've seen a lot of posts between kernel developers
about (U)EFI.  My impression is that the situation is still very
fragile [to be fair, most of the threads were primarily about signing
the kernel for secure boot], and I recall reading that many commits
which fixed things for some people broke things for others.  In
particular, I had the impression that many Apple machines seemed to
disagree with fixes that worked on "regular" Windows EFI x86 and
(perhaps) on ARM.

 So, I think it might be a problem either with the kernel config, or
with the kernel version.  Since you can't boot, that isn't a very
helpful observation.  Can you do anything to pause it after it fails
to boot LFS?  On all the machines I've ever built for (x86 with
BIOS, and various PPC) the automatic reboot happened about 3 minutes
after the boot failed, which gave me time to note some of the
visible messages.  Hmm, if this is a laptop then I suppose the
screen might not have enough rows to show anything useful.  The
same may apply if you got an oops and the kernel state was dumped.

 I *guess* a way to approach this might be to take the .config from
Ubuntu (/boot/config-3.x.y-z-something), work out what needs to be
built in instead of modular [ run lsmod on Ubuntu - if a module is
loaded, you probably need it, at least until you can boot ], and
then build a similar kernel version with that config and try it.
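
 To make the "lsmod versus .config" comparison a little less
tedious, here is a rough Python sketch.  It only lists the loaded
module names and the options that the config builds as modules; it
does NOT map a module name to its CONFIG_ symbol, that part is still
manual.  The function names and the sample text are made up for
illustration.  I believe the kernel tree's 'make localyesconfig'
target is meant to automate this sort of conversion (run from a
kernel source tree on the system where the modules are loaded), so
that may be worth trying first.

```python
def loaded_modules(lsmod_text):
    """Return module names from `lsmod` output (first column)."""
    names = []
    for line in lsmod_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def modular_options(config_text):
    """Return the CONFIG_ symbols that a kernel .config sets to =m."""
    options = []
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and line.endswith("=m"):
            options.append(line[: -len("=m")])
    return options


# Made-up sample input, only to show the expected text shapes.
LSMOD_SAMPLE = """Module                  Size  Used by
ahci                   25819  3
ext4                  473259  1
"""

CONFIG_SAMPLE = """# CONFIG_EXAMPLE is not set
CONFIG_SATA_AHCI=m
CONFIG_EXT4_FS=m
CONFIG_EFI=y
"""

if __name__ == "__main__":
    print("loaded:", loaded_modules(LSMOD_SAMPLE))
    print("built as modules:", modular_options(CONFIG_SAMPLE))
```

 On the real system you would feed it the saved output of lsmod and
the contents of the /boot/config-* file instead of the samples.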

 What I mean is, IFF you are running e.g. 3.2.0-34-generic on Ubuntu,
then first try 3.2.0, but also try 3.2.latest.  If either boots, it's
a start.  If both boot, go with 3.2.latest.  If neither boots, I've
probably wasted your time.
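
 As a small illustration of picking the versions to try, this
Python sketch pulls the leading numbers out of a release string such
as the one 'uname -r' prints.  The function name is hypothetical,
and it only splits the string: it makes no claim about which
upstream SUBLEVEL a distro kernel is really based on.

```python
import re


def split_release(release):
    """Split e.g. '3.2.0-34-generic' into ('3.2.0', '3.2')."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised release string: %r" % release)
    major, minor, sublevel = match.groups()
    version = "%s.%s.%s" % (major, minor, sublevel)
    series = "%s.%s" % (major, minor)
    return version, series


if __name__ == "__main__":
    version, series = split_release("3.2.0-34-generic")
    print("first try", version, "then", series + ".latest")
```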

 If only 3.2.0 boots, perhaps try "broad brush" bisection between
3.2.0 and 3.2.latest [ I don't know what the latest 3.2 SUBLEVEL is,
my oldest running kernel is now 3.10.something, and anyway you might
not be running 3.2 on Ubuntu ].

** Once you can boot, please first try 3.10.latest, using that
.config as the input to 'make oldconfig', in case that works - if
so, the job is done **

 e.g. if latest is .100, for "broad brush" bisection try .50, then
either .25 or .75 according to whether or not that boots.  Depending
on available time, you might get to the latest version of _that_
kernel which boots.  That would indicate that one of the commits in
the next SUBLEVEL broke it.
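
 The "broad brush" bisection above can be sketched in Python as
follows.  This is only an illustration: boots() is a hypothetical
stand-in for "build that SUBLEVEL, try it, and report whether it
booted", and the search assumes that .0 boots, that the latest does
not, and that there is a single point where things broke (which
real kernels do not guarantee).

```python
def last_booting_sublevel(latest, boots):
    """Return the highest SUBLEVEL that boots.

    Assumes SUBLEVEL 0 boots, `latest` does not, and every SUBLEVEL
    after the first broken one is also broken.
    """
    good, bad = 0, latest
    while bad - good > 1:
        mid = (good + bad) // 2  # .50 first when latest is .100
        if boots(mid):
            good = mid
        else:
            bad = mid
    return good


if __name__ == "__main__":
    tried = []

    def boots(sublevel):
        # Pretend, for the demo only, that .63 introduced the breakage.
        tried.append(sublevel)
        return sublevel < 63

    print("last good SUBLEVEL:", last_booting_sublevel(100, boots))
    print("SUBLEVELs tried, in order:", tried)
```

 With 100 SUBLEVELs that is about seven kernel builds instead of a
hundred, which is the whole point of doing it this way.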

 Hmm, Ubuntu are doing long-term maintenance of some kernels, of
which I think one is 3.8 and I'm not sure about the other - I think
they are adding their versions at the fourth level [ EXTRAVERSION ].
Whatever, the first step is to try to get a kernel which boots.

 If you can achieve that, the next step is to move through the newer
still-maintained versions.  I think the following are still
maintained officially - 3.2, 3.4, 3.10.  If it is an "errant
hardware" problem I will have to suggest that you might need to try
ALL the various SUBLEVELs of whichever of those series are newer
than what Ubuntu runs.  Hopefully .0 will work, but it is possible
that a later SUBLEVEL fixes an error, only for a still-later one to
break it again.

 Actually, once you have a .config that lets you boot LFS, I guess
the most efficient thing to do would be to try 3.10.latest with that
config, in case it works.  [ I've added the '**' sentence above ].

 I'm sorry this sounds painful (it probably will be), but EFI is not
yet something that most Linux users are using.  For the BIOS, it
took years to sort out some of the weirdnesses.  And I'm sorry, my
suggestion to try 3.10.latest once you have a working older .config
will short-circuit a lot of what I've written (if it works).  I'm
sure you can find an appropriate path through the twisty maze of
kernel versions, all almost alike ;-)

ĸen
-- 
the first time as tragedy, this time as farce


