Troubleshooting boot problems


#1

Hi,

My ETC no longer boots properly :frowning:
It boots up to the point where the big “ETC” is shown for the first time. After a second or less the LED turns red.

I restored the complete content of the /root/ETC directory from a backup, which is known to be working, but it did not change anything.
Is there anything written to a logfile or nohup-file? (I didn’t find anything)

If not, then I’d like to redirect stdout and stderr of all the scripts to a logfile. But for that I need the start order of the ETC subsystem.
I assume everything is started via a systemd service, which then calls setup.sh and other scripts. But analyzing that by trial and error is very … ummmh … tedious. Can you list the startup sequence of the ETC application (not the OS itself)?
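
A minimal sketch of the kind of redirect meant here, assuming the start scripts are plain POSIX shell and that some writable location exists. The helper name `start_logging` and the log path are placeholders, not part of the ETC scripts.

```shell
# Hypothetical helper: from this point on, send everything the script
# prints (stdout and stderr) to a log file, appending rather than truncating.
start_logging() {
    log_file="$1"
    exec >>"$log_file" 2>&1
    echo "--- $0 started ---"
}

# Usage near the top of a start script such as setup.sh
# (the log path is an assumption):
#   start_logging /usbdrive/etc-boot.log
```

Because `exec` changes the file descriptors of the running shell, every command after the call inherits the redirect without further changes to the script.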

Thanks in advance for your help

Florian


#2

Since it’s using systemd you can use journalctl to get log messages.
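
As a rough sketch, assuming standard systemd tooling on both the ETC and the desktop: the helper below reads the journal from a card mounted on another machine and reports when no persistent journal exists. The function name and the mount point are made up for illustration.

```shell
# Read the journal stored on an SD card that is mounted on another machine.
# $1 is the mount point of the card's root filesystem (an assumption;
# use wherever the card is actually mounted).
read_card_journal() {
    journal_dir="$1/var/log/journal"
    if [ ! -d "$journal_dir" ]; then
        # Without this directory journald typically keeps its log in RAM
        # under /run/log/journal, so nothing survives a power cycle.
        echo "no persistent journal under $journal_dir" >&2
        return 1
    fi
    journalctl -D "$journal_dir" --no-pager -p warning
}

# On the device itself, if a login is possible:
#   journalctl -b -p err     # errors from the current boot
#   journalctl -b -1         # the previous boot, if the journal is persistent
```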

What I’d do if I were you is grab the ETC system image and start afresh by burning it onto an sdcard using etcher.io on your desktop.
(unless you’re really interested in understanding why it failed, which could be time consuming)


#3

Hello Technobear

Thanks for the hint about journalctl (I come from old System V-ish Unixes like HP-UX and Solaris, so this systemd world is still new to me). Unfortunately it seems that system.journal is only written to the file when the filesystem is synced at shutdown. Since I cannot shut down the ETC properly (it hangs in the red LED state), I have no logfile…

Unfortunately there seems to be no image of the complete sdcard available. On GitHub there are only the various subdirectories, which I already have in my backup.

Florian


#4

from your description, I’d say that the python process is failing… I’ve seen this on OTC too for some users, but it’s far from clear why, as it’s never done it on my image.
I say this because the initial ETC screen is done by direct access to the frame buffer; you then see the modes loading when the ETC runs python2 etc.py.
what has happened for those users is that the Organelle locks up completely, as the kernel appears to crash… so things like wifi/console stop working

there are a couple of other ‘tricks’ that may be handy
a) you should still be able to use wifi to log in if it’s not a kernel panic
b) a usb stick with run.sh on it will run instead of the python process (see scripts folder), so allowing you in
c) load the sdcard on a linux desktop machine, then alter the start scripts, again to let you in before it crashes

does it display a console if you have a keyboard/hdmi monitor attached? that’s the other way in, but again you might need to stop python running to stop it kernel panic’ing

etc disk images : http://thepeacetreaty.org/etc/diskimages/

you might want to confirm with @oweno / @chrisk that these are ‘current’, as I’ve not used them, I just stumbled on them, as it’s essentially the same url as for the organelle :slight_smile:


#5

Hi

I think the hang happens before starting python (or while starting python).

Nope. Neither run.sh nor WiFi.sh is executed at all (echo bla > /usbdrive/bla.log doesn’t do anything).
I think /root/ETC_Sys/scripts/setup.sh isn’t executed at all.

I think the boot is hanging somewhere in the services startup. I thought about disabling the services, but for that I would have to do a chroot. But I cannot chroot on my linux installations, because all of them are 64-bit and the ETC is 32-bit.

I’ll try the diskimages. Thanks for the link!

I will report.

Florian


#6

so what I would do is mount the sdcard on your linux box, and edit fbinit.sh and setup.sh to ‘do nothing’
(or just rename them and the service will then fail :wink: )

(to edit the files, it doesn’t matter what version of OS you use on your desktop, it just needs to support the ext fs)

that will allow you in, then you can manually run through the steps to see what fails.
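
roughly like this, as a sketch only: the helper names and the mount point are made up, and I’m assuming fbinit.sh lives next to setup.sh in the scripts folder.

```shell
# Rename a start script on the mounted card so the service that calls it
# can no longer run it. Renaming rather than deleting keeps it easy to restore.
# $1: mount point of the card's root filesystem (assumed)
# $2: script path as seen on the ETC, e.g. /root/ETC_Sys/scripts/setup.sh
disable_script() {
    target="$1$2"
    if [ ! -f "$target" ]; then
        echo "not found: $target" >&2
        return 1
    fi
    mv "$target" "$target.disabled"
}

# Undo the rename once the fault has been found.
restore_script() {
    mv "$1$2.disabled" "$1$2"
}

# Usage (device, partition and mount point are placeholders):
#   sudo mount /dev/sdX2 /mnt/etc
#   disable_script /mnt/etc /root/ETC_Sys/scripts/setup.sh
```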

I still think it’s setup.sh that’s failing…

if you look at the last line of fbinit.sh
cp /root/ETC_Sys/scripts/splash /dev/fb0
this is why you get the big ETC logo, so that is obviously working.

ok, it could be that fbthing.service is not running, but then you shouldn’t be locked out via hdmi/keyboard,

but if it is run, then you should be able to get the usbdrive/run.sh to work…
and really the only thing in setup.sh that is likely to ‘hang’ is python :wink:


btw: just looked at ETC_Sys/setup.sh
I wonder if the usb run.sh functionality has a bug,
this will only work IF the usb drive is set to auto mount,
if you look at the else clause it calls mount.sh , so assumes it is not auto mounted.

so perhaps try to move the mount.sh to the top of the script, and see if run.sh then gets called.
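
to be clear about the ordering I mean, here is a sketch. This is NOT the real setup.sh: the function name is made up, and the mount and default commands are passed in as stand-ins for mount.sh and the python start.

```shell
# NOT the real setup.sh: a sketch of the suggested ordering only.
# $1: directory where the USB drive is mounted (e.g. /usbdrive)
# $2: command that mounts the drive (stands in for mount.sh)
# $3: command to run when no run.sh is present (stands in for the python start)
start_with_override() {
    usb_dir="$1"
    mount_cmd="$2"
    default_cmd="$3"

    # Mount first, so run.sh is visible even when the drive is not auto mounted.
    $mount_cmd || echo "mount step failed, continuing" >&2

    if [ -f "$usb_dir/run.sh" ]; then
        sh "$usb_dir/run.sh"
    else
        $default_cmd
    fi
}
```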

(I don’t have an ETC available to verify whether the usb drive is set to auto mount or not)

btw: what happens if you do not have any usb drive inserted? do you get a ‘no modes found’ message?

(it could be the usb filesystem is corrupted/unreadable by ETC, and this is just causing python to hang when it tries to load/access the modes)


#7

Just a short reply (more tomorrow): I tried the two images from thepeacetreaty.org/etc/diskimages/.
The one from 2016 runs, but it is apparently a very early alpha version (it does not react to the potentiometers, only to the buttons).
The image from 2017 does not boot past the ETC screen, but does not result in a red LED either.

Regarding the setup.sh, I think you are right.

I hope @oweno or @chrisk will provide an up to date image. More tomorrow!

Good night.


#8

OK, I tested a little further: none of the shell scripts in /root/ETC_Sys/scripts seems to be invoked. I added to each script
echo "scriptname" >> /root/bla.log
as the second line after the starting #!/bin/sh, but the file /root/bla.log isn’t even written.

So I am calling @chrisk or @oweno : can you please place an image of the operating system sdcard somewhere for download? That would be extremely helpful for me. Thank you!

Best regards
Florian


#9

The root filesystem is not writable by default; you need to remount it read-write.


#10

Ha! That came to my mind the second after I had hit “send”; so I tested with /tmp/bla.log. But still nothing… :frowning:


#11

/tmp is a ramdisk (afaik)
your best bet is to just try mounting the usbdrive, and writing to that.
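
something along these lines might help, as a sketch: it appends a marker to the first writable directory from a list, so the same line works whether or not the usb drive is available. The function name is made up, and the check only tests writability, not whether the drive is really mounted.

```shell
# Append a marker line to bla.log in the first writable directory given.
# Candidate directories are tried in the order they are passed.
trace() {
    message="$1"
    shift
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "$(date) $message" >> "$dir/bla.log"
            return 0
        fi
    done
    return 1
}

# Usage in a start script, preferring the USB drive over /tmp:
#   trace "setup.sh reached" /usbdrive /tmp
# To write under /root instead, the root filesystem would first need
#   mount -o remount,rw /
```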


#12

Yes! I am sorry that it was not up there. I just uploaded it:

20170310-etc.img
20170310-etc.img.sha1.txt

http://thepeacetreaty.org/etc/diskimages/
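
A quick way to check the download against the published checksum, as a sketch: it assumes a Linux desktop with `sha1sum`, and that the first field of the .sha1.txt file is the hex digest (the usual `sha1sum` layout). The function name is only for illustration.

```shell
# Compare an image's SHA-1 with the digest published next to it.
# Assumes the first whitespace-separated field of the .sha1.txt file
# is the hex digest; adjust if the file is laid out differently.
verify_image() {
    image="$1"
    sha_file="$2"
    expected=$(awk 'NR == 1 { print tolower($1) }' "$sha_file")
    actual=$(sha1sum "$image" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage:
#   verify_image 20170310-etc.img 20170310-etc.img.sha1.txt && echo "checksum OK"
# Writing it to a card afterwards (replace /dev/sdX with the card's device;
# this overwrites the whole device):
#   sudo dd if=20170310-etc.img of=/dev/sdX bs=4M conv=fsync
```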


#13

Perfect. Download has started.

Aaargh - or even double arghh, because this also eats RAM, which we need desperately for storing images.

Tried that, but the hang apparently happens before the mount.

I have the image of the broken installation. So I will try to diff the two filesystems to find out what caused the hang.
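
A sketch of how that comparison could be done once both images are mounted (the function name and mount points are placeholders). It only lists which files differ or exist on one side; the interesting ones can then be diffed individually.

```shell
# List files that differ, or exist on only one side, between two
# mounted directory trees. diff exits 1 when differences are found,
# which is expected here, so only exit codes above 1 count as errors.
compare_trees() {
    status=0
    diff -rq "$1" "$2" || status=$?
    [ "$status" -le 1 ]
}

# Usage (mount points are assumptions):
#   compare_trees /mnt/broken/root/ETC_Sys /mnt/good/root/ETC_Sys
# When comparing a whole root filesystem, note that diff follows symlinks,
# so absolute links resolve against the desktop; GNU diff offers
# --no-dereference to compare the links themselves.
```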


#14

Hello

Thank you! That solved the actual problem. The ETC boots up fine again.

I will compare the two images over the next few days and hopefully be able to report back what caused the trouble.