#qi-hardware IRC log for Saturday, 2012-06-09

rohoomk are stupid anyhow.00:36
qi-botThe build has FAILED: http://fidelio.qi-hardware.com/~xiangfu/building/Nanonote/Ben/openwrt-xburst.full_system-20120608-0725 03:22
viriccan it be that I have a hang on oomk, because I have the syslog outputting to a tmpfs? maybe it needs to allocate tmpfs pages to save what oomk printks.07:29
viricand deadlocks.07:30
viricmh it may be related to reiserfs too... grr07:32
rohviric: people using reiserfs wondering over strange system behaviour are weird anyhow ;)08:17
rohviric: you could try syslogging to udp network and let another host on the local lan dump it to disk. that way you may have a chance to see your logdata08:20
viricroh: yes, I'm building the netconsole module now08:29
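[editor's sketch] roh's suggestion above (send the log over UDP and let another host on the LAN write it to disk) could look roughly like this on the receiving host. It is a minimal sketch, not something either speaker ran. The function names and the log file name are hypothetical, and port 6666 is assumed because it is netconsole's default target port.

```python
import socket

def make_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket bound to host:port (6666 assumed: netconsole's default target port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock

def receive_datagrams(sock, out_file, max_packets=None):
    """Append each received UDP payload to out_file; return the packet count.

    With max_packets=None the loop runs until the process is interrupted.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _addr = sock.recvfrom(65535)
        out_file.write(data)
        out_file.flush()  # hand every packet to the OS immediately
        count += 1
    return count
```

On the logging host one would call `receive_datagrams(make_listener(), open("remote-console.log", "ab"))` and leave it running while the sheevaplug is pushed into the OOM condition.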
viricroh: reiserfs worked very stable for me for the last 10 years, but in 3.3 and 3.4 I'm seeing deadlocks in it08:30
viricstable = no deadlocks, never lost data.08:30
viricreiserfs was famous for using the BKL a lot... and I think that recently someone broke up the locks into something more scalable. And maybe there are deadlocks now.08:31
viric(scalable in terms of SMP)08:31
rohdunno. i am happy with ext3/408:38
Aylax-Just out of curiosity, why are you using reiserFS?08:38
viricbecause I lost data with ext3 :)08:39
viricseveral times08:39
viric(on system hang, and things like those)08:39
roheh. no. that doesnt happen with unbroken hw. dead ram/diskcontroller/etc yes... but there is raid and ecc against that08:40
viricbut I can't tell for ext4. Never used it still. But I don't trust at all the promises of ext3 journaling08:40
viricnah, the hw has always been the same. ext3 - corrupted files. reiserfs, never corrupted files.08:40
viricBut maybe ext4 fixes all those things.08:41
viricI don't know.08:41
viricI'm starting to use ext4, due to the recent deadlocks in reiserfs.08:41
viric2 days of ext4 already; I'll tell :)08:41
viriclast time I tried ext3, it took me 3 days to have my boot scripts full of binary data.08:42
viric(that was around 2.6.34)08:42
rohhuh? without crashes?08:42
viricYes, there were crashes08:43
rohdue to what?08:43
viricwell, I was setting up a sheevaplug system... and sometimes I got left without network, ... and I powered off the device08:43
rohive only seen such things with broken blockstorage happen. means.. bad diskcontrollers, etc. anything that makes the blocks not keep state or corrupt below the fs layer.08:43
viricI thought the journal would hold all fine.08:44
rohsheeva is flash. there should never be ext on there without ubi or something similar between that and mtd.08:44
viricWell, I remember experiencing the same kind of corruption on PC over spinning disks when I had used ext3 on PC... so I conclude that should be ext3. And replacing the fs with reiserfs always solved for me any troubles. 08:45
Aylax-They can't just use ext3 on top of mtd08:45
viricroh: no, I don't use the flash. I used ext3 over SD.08:45
viricI used ext3 only temporarily, because I wanted to setup the initrd to have reiserfs modules. And before I could do that, the ext3 got corrupted :)08:46
rohsdcards are similar fails when it comes to blocks. there is a dumb remapper in there but its usually slow. on sd make sure the journal is 'in the front' where the FAT usually resides on msdosfs08:46
viricroh: lots of people tell me that ext3 worked fine for them08:46
rohgives better performance and less possibility of losing data due to unrecoverable lost blocks08:46
viricwell, my experience is that I lost data with ext3 (always *soon*), and never with reiserfs, no matter how bad I powered off the systems08:47
viricso I don't care much about ext3 promises. :)08:47
Action: roh needs to learn more about ubifs and ubi for such cases. usually i got squash and jffs2 on small stuff08:47
viricIf I had lost data only once, I could doubt. But it happened several times in my sporadic short usages of ext3. 08:48
viricBut you'll find people that tell you the opposite: lost data with reiserfs, and never with ext3.08:48
viricWhy that? I've no idea.08:48
rohi did a test how good journaling keeps data sane about 10 years ago... reiser didnt even perform as good as xfs and i got that to 'crashing kernel on boot'08:48
viricMost people that told me that ext3 worked fine for them, they use UPS and stable systems. :)08:48
rohwe needed something for the dvr storage drive inside a set-top-box08:49
viricreiserfs3 you mean, right?08:49
rohso i tested it by repeatedly unsafe shutdowns under write-load08:49
viricwell, my trouble with xfs is that it can't shrink08:49
rohback then? dunno. 10-11 years ago. long time. but it was bad. through the board. ext always came up. took time to fsck (no ext3 back then)08:50
viric(and they have those weird (although correct) semantics that used to leave 0-byte files)08:50
rohbut didnt lose data or even segfault on recovery like xfs.08:50
rohreiser corrupted but always came up i think08:50
viricroh: ah you mean xfs segfaulted?08:50
viricI thought you meant reiserfs segfaulted, and xfs not.08:51
rohfsck.xfs was a 'exit 0' dummy and the kernelcode to 'fix it on mount' just crashed the kernel. not nice08:51
viricyes, xfs never had fsck in the sense of 'run before mounting'08:51
viricand they use that 'exit 0'.08:51
viricThe same does btrfs.08:51
rohreiser actually had a 'if the kernel says its too bad to mount' fsck tool which even worked, but corrupted some files.08:52
rohi have high hopes for btrfs. didnt test it tho.08:52
viricI'm using it in my home and office computers...08:52
viricIt performs very bad for files that get randomly written   (vm images, databases, ...)08:52
viricbecause that gets very fragmented.08:52
rohthe only thing i am kinda annoyed by is sometimes the performance of my badly overcrowded ext3 disks in my workstations. everywhere else i have no issues.08:53
viricfor the rest, fine.08:53
viricAh, another deal with ext* filesystems.... I often ended up running out of inodes.08:53
viricWell, for my taste, *too often*. I wouldn't expect it never to happen.08:53
rohhm. never had that. (outside of quotas)08:53
virichaving free space, and not being able to write files.08:53
viricreiserfs does not have inode limits08:54
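[editor's sketch] The situation viric describes (free space left, but no files can be created) can be checked from userspace with `os.statvfs`. This is a minimal sketch. The function names are hypothetical, and filesystems without a fixed inode table may report 0 total inodes, which the second function treats as "no inode limit".

```python
import os

def inode_usage(path):
    """Return (total_inodes, free_inodes) for the filesystem that holds path."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree

def space_left_but_no_inodes(path):
    """True if data blocks are still available but every inode is used.

    f_files == 0 is taken to mean the filesystem reports no inode limit.
    """
    st = os.statvfs(path)
    return st.f_bavail > 0 and st.f_files > 0 and st.f_ffree == 0
```

Calling `space_left_but_no_inodes("/var")` on an ext3 volume in that state should return True, while `df -h` would still show free space.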
viricFinally I used ext3 only for /boot, due to broader loader support :)08:54
viricroh: different users, different experiences! :)08:55
viricanyway, why the sheevaplug deadlocks on OOMK, we'll see once I have netconsole.08:55
viricI never used it before.08:56
virichm is it me, or their logo http://www.plugcomputer.org/ looks like a man going to poo, seen from the back?08:58
viricwell, not necessarily a man. but with a hairy bottom.08:58
Aylax-Once you see it ... x_x08:59
viricweird. :)09:00
rohviric: i dont own one but i have heard the sheevas do like to overheat and corrupt ram09:23
virichm no, I can reproduce the troubles quite quickly09:23
viricI never saw ram corruption - my only hangs are always related to oomk09:24
viricroh: and it's not super-hanged... Just in deadlock. sysrq works fine.09:24
viricroh: do you want to see the backtraces? I'll prepare a txt09:29
viricroh:  http://sprunge.us/XdBN09:30
viricoops that lacks the out_of_memory09:31
virichttp://sprunge.us/GdBO    that's from the same session, where only sysrq answers.09:32
rohi'm a bit confused... reiser needs to dynamically allocate memory to get a write lock for sth. on disk?09:43
rohatleast thats what that dump says to me09:43
rohor is that just incidental in there since something else tries to get ram and its swapping to a file on reiser? /me confused09:44
viricwhat makes you think it needs to allocate memory?09:44
viricI think that the 'vm' threw away read-only pages from memory, expecting that they can be read from disk again when needed. 09:45
viric(.text sections for example)09:45
viricThen, there are 'page faults' that trigger reiserfs_get_block.09:45
rohyeah. well.. why does it try "reiserfs_write_lock_once" then?09:46
rohdo you need a lock to read sth?09:46
viricreiserfs has a lock to read, yes.09:46
viricwho owns that lock, I've no idea...09:47
viricI should have built kdb...09:47
rohit's still weird that this doesnt happen in ram the kernel doesnt give away (static buffers)09:48
viricwhat should happen in ram?09:48
viricI think that the culprit is having one process locked at "out_of_memory"09:49
rohhm. not 'in ram' 'in memory which is never unavailable to the kernel' .. like static buffers.09:50
viricbut what? what should be using static buffers?09:50
rohi find it kinda odd for fs code to depend on dynamic memory allocation when that(the fs code) is what needs to work if you run out of free memory (need to swap)09:50
viricWhat makes you think reiserfs depends on dynamic memory allocation?09:51
rohit should USE memory for cache (as the fs cache does) but not depend on it.09:51
rohviric: the trace. or do you see anything else in there?09:51
viricI told you09:51
viric11:45 < viric> I think that the 'vm' threw away read-only pages from memory, expecting that they can be read from  disk again when needed. 09:51
viricpage faults that trigger read from disk. Nothing more. What do you see about dynamic memory allocation?09:52
rohviric: in your other sysrqs09:53
rohwell. the oops09:53
viricthere are no oops09:53
viricsomehow I always related these hangs to tmpfs, not reiserfs...09:55
rohhm. well.. why is the reiser in some mutex then? or is that just a snapshot (the lower trace)09:55
viricmaybe the trouble is the out_of_memory being called in a pagefault exception, simply.09:56
viricroh: snapshots. sysrq-regs and sysrq-blocked, simply09:56
rohthe lower parts of both traces look kinda similar09:56
viricmaybe out_of_memory should never be called from a page fault exception! could it be that?09:56
viricthat would be a broad kernel bug though09:56
rohup to [<c0068be4>] (filemap_fault+0x1e0/0x4b0) from [<c007e3dc>] (__do_fault+0x68/0x4b4)09:57
viricroh: yes, that's the page fault09:57
roh__alloc_pages_nodemask is called from the fault, and thats ooms09:58
viricI see. maybe that's the cause of deadlock: trying to allocate from a page fault09:58
rohdo you have weird overcommit settings maybe?09:58
viricno no, default.09:59
rohsigh. i'm out of ideas10:01
viricit's pretty easy to reproduce: I don't have any swap at all, I've some tmpfs for /var/log and /tmp, and just have some process ask lots of ram10:02
viric1st oomk works fine, 2nd too...10:03
viricand around 3rd or 4th, it deadlocks10:03
rohdoesnt make it better. a system should never have to oomk. doesnt help make stuff work better.10:07
viricroh: same situation? http://www.mail-archive.com/android-developers@googlegroups.com/msg147682.html10:08
viricsearching the www for "filemap_fault out_of_memory" gives deadlock situations10:08
rohonly leads to stuff like "sshd adjusting its /proc/self/oom_adj"10:09
viricI only find 'crashes' :)10:10
virichm no, some people have successful oomk.10:14
viricroh: usually, all my processes have oom score = 0.10:15
viricThat may mean that the oomk can't find any process to kill, and then it hangs10:15
viricOr maybe all processes with oom_score > 0  are locked in reiserfs :)10:16
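[editor's sketch] viric's guess that the OOM killer finds no candidate could be checked by listing the `oom_score` of every process. This is a minimal sketch that reads `/proc/<pid>/oom_score`. The function names and the `proc_root` parameter are hypothetical, and it says nothing about whether a score of 0 is really what makes the kernel hang here.

```python
import os

def oom_scores(proc_root="/proc"):
    """Return {pid: oom_score} for every process visible under proc_root."""
    scores = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "oom_score")) as f:
                scores[int(entry)] = int(f.read().strip())
        except (OSError, ValueError):
            # the process exited in the meantime, or the file is unreadable
            continue
    return scores

def nonzero_candidates(scores):
    """Sorted list of PIDs whose oom_score is above 0."""
    return sorted(pid for pid, score in scores.items() if score > 0)
```

If `nonzero_candidates(oom_scores())` comes back empty right before the hang, that would at least be consistent with the theory above.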
Aylax-Speaking about filesystems. What FS would you recommend for a very slow SSD?10:23
viricat this point, I've no idea :)10:23
viricI'd say all fs work bad ;)10:23
rohAylax-: do you need rw or is ro ok?10:23
Aylax-roh, I'd prefer r/w10:24
viricAylax-: btrfs has some wins, in that it always tends to write big new blocks (due to COW)10:24
rohslowness doesnt really matter. you can always add more ram to cache that away. is it big?10:24
rohif you have some embedded device which only needs rw for system updates and storing config, i think the best combo is the openwrt way. squashfs for the initial userland and jffs2 for the rest (configs, later installed packages)10:25
Aylax-No, I can't add more RAM, the mother board does not accept more than 2GB10:26
Aylax-It's my ACER AAO 11010:26
Aylax-8GB of a super-crappy SSD10:27
viricI've a crappy 32GB SSD :)10:27
rohwell.. then swap it for a cheap 2.5" hdd10:28
rohthose are faster than slow ssd mostly. and a lot cheaper and bigger10:28
viricAylax-: why ask for software solutions, when you can have hardware replacement solutions? ;)10:28
Aylax-viric: I tried btrfs, was very bad10:29
Aylax-Because of one thing: fsync10:29
rohi still use them in my thinkpad... proper ssd is much too expensive for my taste (and also fails like harddisks sometimes)10:29
Aylax-roh: there's no space for a 2.5" hdd :-)10:29
Aylax-1.8" could fit10:30
rohAylax-: huh? the acer webpage said that its either coming with a 2.5" hdd or a ssd10:30
Aylax-Yes, but the SSD version is a bit different inside10:30
Aylax-The SSD is a tiny PCI-E card10:31
rohand what happened to the disk slot the other model has?10:31
viricAylax-: did you try it in recent kernels? it has lots of development10:32
Aylax-No, I should try again10:38
Aylax-Do they have a working fsck now?10:38
Aylax-And support from Grub 2?10:40
viricthey have some fsck, but most of recovery is done in-kernel, mounting with "-o recovery"10:43
viricsomeone wrote grub2 support, but it was someone at grub, not at btrfs.10:43
viricgrub2 has an incompatible license with btrfs10:44
viricand so grub2 people have to write a btrfs reader not reading btrfs code.10:44
viric(gplv3 vs gplv2)10:44
viricI imagine it's not only the btrfs case.10:44
viricif grub2 people want to keep the gplv3 label in their code, they can't pick gplv2-only code.10:45
Aylax-The problem I had with btrfs is that it took hours just to install a random package on Debian10:47
viricI know apt does a lot of fsync10:47
viricthey've been working on that I think10:48
Aylax-I tried last summer I think10:53
whitequarkroh: iirc jffs2 cannot work on non-mtds12:21
Aylaxwhitequark: how does it compare vs. UBI12:29
Aylaxubifs I mean12:29
rohwhitequark so what?12:32
rohwhitequark: if i dont have mtd i usually dont need jffs2 ;)12:32
whitequarkAylax: ubifs is significantly faster than both jffs and especially yaffs on big mtds12:37
whitequarkroh: the ssd in that notebook isn't an mtd, but is a SATA drive12:37
whitequarkor something like that12:38
whitequarka block device.12:38
whitequarkAylax: while some of jffs/yaffs have a mode for block devices (I don't recall details, but there was something like that), ubifs only works on mtds12:38
viricubifs works only on ubi, and ubi works only on mtds.12:40
viricsomething like that.12:40
rohwhitequark: yeah. true. but there is already badblock and ecc management in the ssd12:44
rohwhitequark: so one doesnt need jffs features12:44
whitequarkviric: well, I do not know of anything else that works on top of ubi13:08
viricme neither13:08
rohubi should provide a 'cleaned' blockdevice upstairs. similar to regular disks13:37
rohit does badblock reallocations and handling as well as write balancing for upper layers like ubifs.13:38
rohjffs2 does that internally afaik.13:38
rohboth need mtd below (ubi as well as jffs2)13:38
rohsquash runs better on ubi afaik13:39
viriccan it run on ubi then?13:41
viricor ubifs you mean?13:41
rohviric: on ubi afaik14:12
larscyou can also run jffs2 ontop of ubi15:25
larscsquashfs is read-only so you'd normally not need ubi underneath it15:26
rohlarsc: well.. if you got badblocks...15:26
rohafaik squash cannot deal with holes on its own15:26
rohon small nor thats usually not an issue, but on nand it is for sure (only the first block is guaranteed to be biterror-free on sale)15:27
viricdoes anyone happen to know how a serial port (16550 based, in a PC) can be rendered unusable? All operations on it give EIO15:34
viricdmesg shows ttyS0 and ttyS1 (as usual), but /sys/.../serial8250  only show ttyS2 and ttyS3. weird.15:35
viricstty -F /dev/ttyS1 worked first, but after some work, it only gives EIO.15:35
viricmaybe it's faulty hw... but I'd expect the serial port always to work15:35
larscviric: cd drivers/tty/serial; grep EIO *15:36
viriccouldn't be easier15:36
larscwhat you see there is that all operations return EIO if the TTY_IO_ERROR flag is set15:37
viricand how could I reset that?15:37
larscaccording to the code reopen the device15:38
viricthere was a moment where, more or less, one of every "stty -F /dev/ttyS1" worked fine, the rest gave EIO15:38
viricnah, it wasn't that. lsof said it wasn't opened by anyone15:38
viriclarsc: in http://www.easysw.com/~mike/serial/serial.html, it says it can give EIO in case DCD is not up15:41
viricwell, strange. I ended up rebooting the computer to get it back :)15:43
viricthank you for the hints15:44
viricmaybe increasing the loglevel could say something15:45
qi-bot[commit] Werner Almesberger: modules/Makefile (MODULES): add bat-clip-aa-th (master) http://qi-hw.com/p/kicad-libs/8d40b3823:10
qi-bot[commit] Werner Almesberger: modules/pads-array.fpd: like pads.fpd, but in a array formations (WIP) (master) http://qi-hw.com/p/kicad-libs/86ce0c023:10
DocScrutinizer05wpwrak: http://www.youtube.com/watch?v=9Ww1RH8iAR4 :-D23:11
DocScrutinizer05(my new toy)23:11
wpwrakhow to dramatically improve the personal hygiene of the average hacker ;-)23:16
--- Sun Jun 10 201200:00
