Udev and 26-Modprobe Mismatch?

alupu at verizon.net
Thu Jul 5 18:00:22 PDT 2007


Hello Everybody,

SYSTEM
~~~~~~
686-pc-linux-gnu (B)LFS, 2.6.21.3,
UDEV-113, 26-modprobe.rules v.20070304.
Machine hardware pretty standard and run-of-the-mill.
Nothing complicated or fancy (including peripherals).
I'm not aware of any problem, the system just works.
BTW, no USB standalone devices.

SUMMARY
~~~~~~~
I believe there might be a mismatch between our (LFS)
handling of "aliases" (in '26-modprobe.rules'),
the kernel setup, and Udev's behavior.

DETAILS
~~~~~~~
This topic stems from a short conversation
Dan Nicholson and I had in a recent thread called
"Udev Questions" (end of June; excerpts follow):

- Me (Alex):
>  In 'udev' I noticed that appending '--timeout=3'
>   to 'udevsettle' speeds up the boot considerably
>   while not missing any devices in doing that.

- Dan 2007/06/28 Thu AM 09:31:57 EDT:
I don't recall exactly how udevsettle works, but if
you specify a timeout, that's exactly the maximum
time udevsettle will run for.  The default is 180. 
Otherwise, udevsettle checks '/dev/.udev/uevent_seqnum'
and '/sys/kernel/uevent_seqnum' 20 times per second.
It runs until the value in /dev matches the value in /sys.
So, if it's much faster for you, then I don't think you've
waited until all the pending uevents have been processed.
If you're curious, you can set UDEV_LOG=info for udevsettle.
This would show if the timeout is reached or the queue is
empty in syslog. If you don't process all the events
immediately, then it's entirely unknown whether to proceed,
although probably OK at that point.
----------------
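The wait loop Dan describes can be sketched in shell.  This is only
an illustration of the polling logic, not udevsettle's actual source.
The function name 'settle_wait' and its three arguments are made up
for this example, and the fractional 'sleep' assumes GNU coreutils
or BusyBox.

```shell
#!/bin/sh
# Hypothetical sketch of the polling described above; NOT udevsettle's code.
# settle_wait UDEV_SEQ_FILE KERNEL_SEQ_FILE TIMEOUT_SECONDS
# Returns 0 once both files hold the same sequence number,
# 1 if the timeout expires first.
settle_wait() {
    udev_file=$1
    kernel_file=$2
    timeout=$3
    # 20 polls per second, as in Dan's description.
    tries=$((timeout * 20))
    while [ "$tries" -gt 0 ]; do
        udev_seq=$(cat "$udev_file" 2>/dev/null)
        kernel_seq=$(cat "$kernel_file" 2>/dev/null)
        if [ -n "$kernel_seq" ] && [ "$udev_seq" = "$kernel_seq" ]; then
            return 0    # queue drained: both counters agree
        fi
        sleep 0.05      # assumption: 'sleep' accepts fractions
        tries=$((tries - 1))
    done
    return 1            # timeout reached with events still pending
}
```

On the system described here it would be called as
  settle_wait /dev/.udev/uevent_seqnum /sys/kernel/uevent_seqnum 3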
Dan got me thinking, so I decided to dig deeper and get
some answers to a few nagging questions.  Among them:

Q1.  How come I'm not missing any device despite the
artificially "short" timeout I set for 'udevsettle'?

Q2.  Why does the system "hang" every once in a while for
a minute or so (looking like a deadlock among the
little threads Udev forks around to do some work)?

Q3.  Why do I always end up with the same 11-entry 'failed'
directory in '/dev/.udev'?  Come to think of it, why
should I put up with any failures at all?

I know a lot more now (not enough, though); I don't
have all the answers nor the perfect solution, but
at any rate, what I discovered is quite confusing.

After CONCLUSIONS please find the supporting
documentation (in all the gory details).

CONCLUSIONS
~~~~~~~~~~~
C1: On boot, Udev populates '/dev' with all necessary devices
    in less than a second (way shorter than a 3-sec timeout).
    In other words, when 'udevsettle' starts, 'udevtrigger'
    has long finished its appointed duties.

C2: With the present logic, '26-modprobe.rules' forces Udevd
    to fail certain "suspect" devices/drivers, thus creating
    unnecessary "noise", extra work and possibly deadlocks.
    On my present system the "suspect" devices are always
    the same 11 culprits, always ending up in the
    'failed' subdirectory (EVENT_FAILED_DIR).
  
C3: I'm hesitant to ask, but why oh why does the kernel
    create '/sys/.../modalias' files for drivers built into
    the kernel (as opposed to built as modules)?
    Am I totally out of line?
    A typical example is our perennial favorite, "floppy".
    a) It is built into the kernel:
      Device Drivers > Block devices > Normal floppy disk support = Y
    b) No trace of "floppy" in
       '/lib/modules/2.6.21.3/modules.alias'
    c) but 
      []$ cat /sys/devices/platform/floppy.0/modalias
          floppy

    From a '26-modprobe' standpoint, it seems that relying
    solely on 'modalias' files gives you nothing but grief.
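
    The check in b) and c) can be automated.  The sketch below looks
    up a modalias string in a 'modules.alias'-style file using shell
    wildcard matching.  The function name 'alias_lookup' is made up
    for this example, and the dash/underscore normalization that
    modprobe performs is left out, so treat it as an approximation.

```shell
#!/bin/sh
# Hypothetical helper, not part of udev or module-init-tools.
# alias_lookup MODALIAS ALIAS_FILE
# Prints the module named on the first "alias PATTERN MODULE" line whose
# PATTERN matches MODALIAS; returns 1 if no line matches.
alias_lookup() {
    modalias=$1
    alias_file=$2
    while read -r keyword pattern module; do
        [ "$keyword" = "alias" ] || continue
        # $pattern is deliberately unquoted so '*' acts as a wildcard.
        case $modalias in
            $pattern) echo "$module"; return 0 ;;
        esac
    done < "$alias_file"
    return 1
}
```

    For example:
      alias_lookup "$(cat /sys/devices/platform/floppy.0/modalias)" \
                   /lib/modules/2.6.21.3/modules.alias
    A non-zero exit status means modprobe has no module to load
    for that device.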

C4: Until something better comes along, I'm using a
    temporary(?) solution consisting of 9 extra rules
    inserted at the top of '26-modprobe.rules' (not 11;
    I could combine a few, like "serio:ty0*"):

 ACTION=="add", ENV{MODALIAS}== \
  "pci:v00001039d00000001sv00000000sd00000000bc06sc04i00", GOTO="modprobe_end"
 ... 
 ACTION=="add", ENV{MODALIAS}=="floppy", GOTO="modprobe_end"
 ...
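
    Rules of this shape can be generated rather than typed by hand.
    A minimal sketch, assuming one modalias string per input line;
    the function name 'make_skip_rules' is made up for this example.
    Note that udev treats the match value as a glob pattern, so a
    string containing wildcard characters would need extra care.

```shell
#!/bin/sh
# Hypothetical generator for the "skip" rules shown above.
# Reads modalias strings on stdin (one per line, blank lines ignored)
# and prints one rule per string on stdout.
make_skip_rules() {
    while read -r modalias; do
        [ -n "$modalias" ] || continue
        printf 'ACTION=="add", ENV{MODALIAS}=="%s", GOTO="modprobe_end"\n' \
            "$modalias"
    done
}
```

    For example:
      printf 'floppy\ni8042\n' | make_skip_rules
    The output would still have to be pasted at the top of
    '26-modprobe.rules' by hand.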
    
DOCUMENTATION
~~~~~~~~~~~~~
D1:
[]$ ls /dev/.udev/failed
\x2fdevices\x2fide0\x2f0.0
\x2fdevices\x2fide0\x2f0.1
\x2fdevices\x2fpci0000:00\x2f0000:00:01.0
\x2fdevices\x2fpci0000:00\x2f0000:00:01.0\x2f0000:01:00.0
\x2fdevices\x2fpci0000:00\x2f0000:00:02.0
\x2fdevices\x2fpci0000:00\x2f0000:00:02.1
\x2fdevices\x2fpci0000:00\x2f0000:00:02.5
\x2fdevices\x2fplatform\x2ffloppy.0
\x2fdevices\x2fplatform\x2fi8042
\x2fdevices\x2fplatform\x2fi8042\x2fserio0
\x2fdevices\x2fplatform\x2fi8042\x2fserio1
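
The entry names are device paths with each '/' written as '\x2f'.
A small filter makes them readable.  The function name
'decode_failed_names' is made up for this example, and only the
'\x2f' escape seen in the listing above is handled.

```shell
#!/bin/sh
# Hypothetical filter: reads entry names on stdin and writes the
# corresponding device paths on stdout.
decode_failed_names() {
    # Turn every literal "\x2f" back into "/".
    sed 's|\\x2f|/|g'
}
```

For example:
  ls /dev/.udev/failed | decode_failed_names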

D2:
Output of a little applet which prints the kernel name
and alias of any device that '26-modprobe.rules' "sees"
with its first rule:
 ACTION=="add", ENV{MODALIAS}=="?*", \
  RUN+="/sbin/modprobe $env{MODALIAS}"

Name            '/sys/.../modalias' file content
----            --------------------------------
0.0 =>          ide:m-disk
0.1 =>          ide:m-disk
0000:00:01.0 => pci:v00001039d00000001sv00000000sd00000000bc06sc04i00
0000:01:00.0 => pci:v000010DEd00000326sv00000000sd00000000bc03sc00i00
0000:00:02.0 => pci:v00001039d00000962sv00000000sd00000000bc06sc01i00
0000:00:02.1 => pci:v00001039d00000016sv00000000sd00000000bc0Csc05i00
0000:00:02.5 => pci:v00001039d00005513sv00001043sd0000807Abc01sc01i8a
floppy.0 =>     floppy
i8042 =>        i8042
serio1 =>       serio:ty01pr00id00ex00
serio0 =>       serio:ty06pr00id00ex00
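
A listing in this format can be approximated by walking a sysfs
tree directly.  This is a sketch, not the applet used above: the
function name 'list_modaliases' is made up for this example, and it
reports every 'modalias' file under the given directory rather than
what Udev actually handed to the rule.

```shell
#!/bin/sh
# Hypothetical lister: prints "NAME => MODALIAS" for every 'modalias'
# file below the directory given as $1.  NAME is the name of the
# directory that holds the file.
list_modaliases() {
    find "$1" -name modalias -type f | while IFS= read -r f; do
        printf '%s => %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

For example:
  list_modaliases /sys/devices
The order of the lines depends on 'find', so pipe the output
through 'sort' when comparing runs.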

FWIW,
 1039 = SiS, 10DE = nVidia
 Dev_id=0001 with PCI_Class=060400 is the PCI-to-PCI bridge (AGP),
 etc.

Some spot checks:
[]$ cat /sys/devices/ide0/0.0/modalias
ide:m-disk
[]$ cat /sys/devices/pci0000:00/0000:00:02.5/modalias
pci:v00001039d00005513sv00001043sd0000807Abc01sc01i8a
[]$ cat /sys/devices/platform/i8042/serio1/modalias
serio:ty01pr00id00ex00

[]$ grep serio /lib/modules/2.6.21.3/modules.alias
 alias serio:ty02pr18id*ex* warrior
[]$ grep 8042 /lib/modules/2.6.21.3/modules.alias
[]$ grep floppy /lib/modules/2.6.21.3/modules.alias
[]$ grep ide /lib/modules/2.6.21.3/modules.alias
 alias ide:*m-cdrom* ide_cd

Notice above the one-to-one correspondence between what
"26" handles and the devices sent to the 'failed'
repository by some merciless 'udevd' thread like this:

udevd-event[1228]: run_program: '/sbin/modprobe floppy'
udevd-event[1228]: run_program: '/sbin/modprobe' \
             (stderr) 'FATAL: Module floppy not found.'

D3:
Each time you do this you get the same result
(i.e. the same 11 usual suspects):
[]$ udevtrigger --retry-failed --verbose
/devices/ide0/0.0
/devices/ide0/0.1
/devices/pci0000:00/0000:00:01.0
/devices/pci0000:00/0000:00:01.0/0000:01:00.0
/devices/pci0000:00/0000:00:02.0
/devices/pci0000:00/0000:00:02.1
/devices/pci0000:00/0000:00:02.5
/devices/platform/floppy.0
/devices/platform/i8042
/devices/platform/i8042/serio0
/devices/platform/i8042/serio1

Another spot check:
[]$ modprobe floppy
FATAL: Module floppy not found.
[]$ modprobe i8042
FATAL: Module i8042 not found.
[]$ modprobe ide:m-disk
FATAL: Module ide:m_disk not found.

D4:
After the run of 'udevtrigger', on boot:
udevsettle[408]: main: queue is empty
udevsettle[408]: main: udev seqnum = 502
udevsettle[408]: main: kernel seqnum = 502
udevsettle[408]: main: queue is empty and no pending events left

D5:
In the kernel configuration, when I had 
"8250/16550 and compatible serial support" = Y
there was an extra 'udevtrigger' failure:
 /devices/platform/serial8250
with the corresponding additional entry
in '/dev/.udev/failed':
   \x2fdevices\x2fplatform\x2fserial8250.
Strange.  With the driver compiled as a module
(8250.ko, etc.), the problem disappears:
[]$ modprobe 8250
  Serial: 8250/16550 driver $Revision: 1.90 $ 1 ports, IRQ sharing disabled
  serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

Thank you for your patience.
Obviously, I may be totally wrong in my
approach and/or findings, so
help and comments are gratefully invited.

-- Alex



More information about the lfs-support mailing list