#qi-hardware IRC log for Friday, 2011-10-21

wolfspraulwpwrak: looking at the slashdot post, the # of comments does show that people there don't care01:10
wpwrakwolfspraul: or maybe they just couldn't think of anything nasty to say :)02:40
wolfspraulwpwrak: are your nor corruption tests still running?02:44
wpwrakoh yes, very much. i now have a run with a success period of ~10500 cycles. but that's the same hardware that before failed on average every 600 cycles.02:46
wpwrakyeah, it doesn't quite make sense. this brings back the theory of an external factor02:47
wpwrakthe weather is getting warmer now. so maybe it's really just that.02:47
wpwrakso the next step should be to move my M1 to a cooler place. didn't have time for this in the last days (had to do a bit of politicking :)02:49
wolfspraulyes I understand02:53
wolfsprauljust a little worried that we get stuck02:53
wolfspraulso I also think maybe we just leap ahead and apply everything we plan for rc4 anyway already, and then try to reproduce the corruption again02:54
wolfspraulthat will not satisfy the enlightened scientist inside you though :-)02:54
wpwraki think the general plan doesn't change. the pull-ups definitely sound right, even if they don't affect this case of NOR corruption.02:54
wpwrakoh, i think the changes we've discussed all make sense.02:54
wpwrakwhat i'm looking for is an indicator for whether we've actually killed the bug02:55
wpwrakand that's the problem with probabilities that jump around wildly. if the bug is simply not detectable on the day or week you test, you haven't gained anything.02:56
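The numbers quoted above (a failure on average every ~600 cycles before, and now a run of ~10500 cycles without one) can be checked against a simple model. The sketch below assumes failures are independent and occur at a constant rate, so the number of cycles to the next failure is exponentially distributed. That assumption is exactly what wpwrak doubts, and the function names are illustrative only.

```python
import math

def survival_probability(cycles, mean_cycles_between_failures):
    """Probability of seeing no failure in `cycles` cycles,
    assuming a constant, memoryless failure rate."""
    return math.exp(-cycles / mean_cycles_between_failures)

def cycles_for_confidence(mean_cycles_between_failures, confidence):
    """Failure-free cycles needed before 'the bug is gone' can be claimed
    at the given confidence, under the same constant-rate assumption."""
    return math.ceil(-mean_cycles_between_failures * math.log(1.0 - confidence))

if __name__ == "__main__":
    # Figures from the log: ~600 cycles between failures, ~10500-cycle clean run.
    print(survival_probability(10500, 600))
    print(cycles_for_confidence(600, 0.95))
```

Under this model a 10500-cycle clean run on hardware that really fails every 600 cycles would be extremely unlikely, which supports the suspicion that the rate itself changes with some external factor. It also shows why a clean run after a fix proves little as long as that factor is not understood.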
wolfspraulyes I understand02:57
wolfspraulbut like I said - I am worried that you are stuck trying to make a perfect logical sequence02:58
wpwrakbut i'm all for designing in the pull-ups. i don't see any design risk there. the improved reset circuit should be tested first. but it's almost certainly okay, too.02:58
wolfspraulof course it would be better to first understand the true root cause before fixing it02:58
wolfspraulrather than randomly and blindly improving this and that, and then maybe just being unable to reproduce the bug, instead of having fixed it02:58
wpwrakoh, the thing is just running while i do something else :) i check every now and then whether it has found any new troubles, but that's all02:59
wpwrakthat's the beauty of automated tests :)02:59
wolfspraulunfortunately it feeds your perfectionism well too :-)02:59
wolfspraulsince it's automated, we may as well accumulate a better 'statistical data base', right? :-)03:00
wpwrakexactly :)03:00
wpwrakbut i'll actually be happy with one that does give me the same answer twice in a row. it hasn't done that yet. this is what worries me.03:00
wolfspraultime passes and we have each block of time only once03:01
wolfspraulso there may also be some value in working on this backwards, i.e. applying all fixes, and then spending the same amount of time on trying to reproduce the problem again03:01
wolfsprauleven if we lost the reasoning path somewhere in the middle...03:02
wolfspraulbut ok, let's see :-)03:02
wolfspraulAdam is slowly but surely getting closer to rc4 work03:02
wolfspraulworst case I need to skip over some of your missing statistical data and make decisions based on gut feeling :-)03:02
wpwrakyes, that's a possibility. but then, if external factors can just hide the problem completely, your tests are meaningless if you don't understand the external factors03:02
wolfspraulthe alter ego of great statistics03:02
wolfspraulnot entirely meaningless03:03
wolfspraulsince we also think when we try to reproduce03:03
wpwrakin my current test i have at least the certainty that the thing must fail03:03
wolfspraulsure, all good03:03
wolfspraulI got it03:03
wolfspraulanxiously awaiting the results :-)03:03
wpwraki'll post a summary of what i have so far on the weekend. then you'll see what a mess it is.03:04
wolfspraulI can imagine03:04
wolfspraulthat's why I suggest, maybe we are better off rebooting, applying all planned fixes, and then focusing on reproducing the problem again03:05
wpwraktomorrow i'll try to finally get my labsw bom done. it keeps on jumping from mondays to fridays and then to mondays again (digi-key deadlines)03:05
wolfspraulthere is definitely a serious root bug somewhere there03:05
wolfspraulvery serious03:06
wolfspraulif a unit fails like this for a real user, that's bad03:06
wolfsprauland if anything your whole pile of test data shows that there is something serious somewhere03:06
wpwrakfor rc4, i think it's good to plan to have those changes. i think there's no need to wait for statistical subtleties.03:06
wpwrakwhat the statistics can contribute is a test that confirms that the problem is indeed gone - or not03:07
wolfspraulyou think those planned changes (including gate & 4.4v reset ic) will make it go away even without locking?03:07
wpwrakthat's what i don't know yet :)03:07
wolfspraullet me guess your answer :-)03:07
wolfspraul"I don't know yet"03:07
wpwrakbut i do know that i like these changes. they improve overall design stability03:08
wpwrakoh, and have you considered the possibility of using a different NOR chip ? some have other locking strategies that may provide much better protection03:09
wpwrak... while keeping the other parameters the same, i hope03:09
wpwrakthe one we have is just a particularly bad fit. many others just come out of reset locked. with one of these, we may never even have known we had the bug ;-)03:10
wolfspraulwhich one specifically do you have in mind?03:13
wolfsprauland if we change - can we keep one binary to support both?03:13
wolfspraulfiddling with this sounds risky but I do want to make the very best rc4 we can, and we may just need to take some risks here and there03:15
wolfspraulhere's the list http://www.micron.com/partscatalog.html?categoryPath=products/nor_flash/parallel_nor_flash03:15
wolfspraulnow we have JS28F256J3F105A03:15
wolfspraul"JS28F256J3F105A 256Mb Production x8/x16 2.7V-3.6V TSOP 56-pin 105ns Yes Uniform -40C to +85C Embedded J3 Tray"03:16
wolfspraulyes I think that's the current one03:16
wpwrakmaybe one from the M28W series. these don't have persistent locks03:17
wpwrakinstead, there's a "soft" block lock, by default, it is set. you can remove it and set it again as many times as you like.03:18
wpwrakplus, there's a "lock-down" that can be made one-way per session (session = time between resets)03:18
wpwrakso, for example, the boot code or even standby could lock-down things we really really never want to change03:19
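The locking scheme described in the last three lines can be written down as a small state model. This is a toy sketch based only on the description in this log, not on any datasheet: the class and method names are hypothetical, and real parts may differ in detail (for example in how a write-protect pin interacts with lock-down).

```python
class Block:
    """Toy model of one flash block with a soft lock and a per-session lock-down."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Coming out of reset, the soft lock is set and lock-down is cleared.
        self.locked = True
        self.locked_down = False

    def unlock(self):
        # The soft lock can be removed any number of times, unless locked down.
        if self.locked_down:
            return False
        self.locked = False
        return True

    def lock(self):
        # Setting the soft lock again is always allowed.
        self.locked = True
        return True

    def lock_down(self):
        # One-way until the next reset: the block stays locked for the session.
        self.locked = True
        self.locked_down = True

    def can_write(self):
        return not self.locked


if __name__ == "__main__":
    boot_block = Block()
    boot_block.lock_down()          # e.g. done early by boot code
    print(boot_block.unlock())      # refused until the next reset
    print(boot_block.can_write())
```

The point of the model is that a stray write sequence after boot cannot modify a locked-down block, because it would first have to get past a lock that only a reset can release.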
wpwrakthe code for writing would differ a bit between 28F and 28W. but that could be an isolated change.03:20
wpwrakfor all i know, 28W may even be cheaper ;-) let's see ...03:20
wpwrakhmm, size may be an issue ... at least at digi-key. let's see ...03:22
wolfspraulthat switch sounds like a lot of trouble03:24
wolfspraul"differ a bit" "isolated change"03:24
wolfspraulthat sounds like we will struggle to get this right for 2 years03:24
wpwrakwhy the sudden pessimism ? :)03:25
wolfsprauljust realistic03:25
wolfspraulthose 'small details' are nasty03:26
wolfspraulplus we seem to be on track to making our current chip very robust03:26
wolfspraulthen you would wonder what you actually get with a switch03:26
wolfspraulcheaper is nice, bigger. you may even start the serial flash discussion again :-)03:27
wolfspraulthen maybe it's better to just focus on making our current chip work better (fixing bugs), and otherwise leave things as they are03:27
wpwraksomething like this guy perhaps http://www.micron.com/products/ProductDetails.html?product=products/nor_flash/parallel_nor_flash/M29W256GL70N6F03:31
wpwrakwould need to compare the data sheet bit by bit, though. flash is tricky :)03:31
wpwrakoh, i'm all for serial flash. thanks for mentioning it ;-)03:31
wpwrakif all the smarts were on the uSD card, we wouldn't have this flash corruption discussion :) instead, we'd just recommend carrying a backup03:32
wolfspraulnow that again :-)03:33
wolfspraulfrom my experience, WHATEVER path you choose you will run into difficult problems03:33
wolfspraulthe proof is not in how much of a genius you are in choosing the right path, but what you do once you hit the first difficult obstacle03:33
wolfspraulso there is no way I will jump to the "serial flash" promised land now03:34
wolfspraulor even the "M28W" promised land03:34
wolfspraulif there is a better nor chip, we should compare. but my #1 question would be whether the switch causes a need for software changes, and our ability to provide one set of binaries for all m1 boards03:35
wolfspraulthat needs to be weighed against the pros of the chip03:35
wolfspraulmaybe better to focus on making the current one rock solid03:36
wolfspraulany new chip would have new nasty surprises, guaranteed03:36
wpwraksw change: yes. same binary: yes.03:36
wpwrakbut yes, there's always a design risk03:36
kristianpaulwolfspraul: but because namuru and milkymist use different clocks, i'm having some problems reading memory content from namuru, so Artyom suggested i switch the whole soc to use milkymist soc, so basically i get rid of cores i don't need and focus on namuru 05:39
kristianpaulwolfspraul: yes, better documentation about how to connect stuff to milkymist is very important too indeed05:39
kristianpaulas when Artyom asked me about milkymist and how it could be helpful for him..05:50
kristianpaulalso yes i hope stickers and other stuff arrive soon, workshop is already pointed here too http://www.comunlab.cc/05:57
kristianpaulas liure, but thats me :)05:57
kristianpaulwill be something small, more focused on playing with the M1, video in music and some patches, hoping i can guide and people also will get something more elaborate,06:08
kristianpauland in the night i'll see how to set up the M1 somewhere for an ambient music performance06:08
wolfspraulnice that sounds good!06:26
wolfspraulI'm not sure whether the stickers arrive in time, but it's moving06:26
wolfspraulwe need to know about events in advance06:26
wolfspraulcannot always fix bad planning with wasting money on overnight couriers of cheap little stickers :-)06:27
kristianpaulsure not :)06:29
johnnyhahIs anyone familiar with llhdl?07:33
wolfsprauljohnnyhah: not really yet, I guess there is only 1 person on earth right now truly 'familiar' with it and that's Sebastien (nick lekernel)07:34
johnnyhahthx,but how can i contact sebastien(or lekernel)?07:38
wolfsprauljohnnyhah: his primary channel is #milkymist on freenode, and/or also here in #qi-hardware07:48
wolfsprauljohnnyhah: so you are pretty close, just wait, in a few hours he should get up and respond07:51
wolfspraulwhat brings you to qi & llhdl ?07:51
wolfspraulor milkymist07:51
wolfspraulcan you tell us a bit more about yourself?07:51
johnnyhah i am a graduate and want to know how to convert a netlist to verilog or vhdl07:57
wolfspraulwhich school?08:02
B_LizzardI've been away for long, trying to catch up.14:54
B_Lizzardlarsc, what's the latest stable kernel?14:56
B_LizzardAlso, is suspend working OK now?14:56
xiangfuB_Lizzard, we are planning to use linux 3.0 on the next nanonote release. :)15:14
kristianpaulB_Lizzard: 3.0 is the last stable i remember16:34
kristianpaulsuspend still same i bet :)16:34
B_LizzardHmmm, I'll have to look into it.16:34
whitequarkwpwrak: can you take a look at a synchro issue? http://imgur.com/a/YMP9d16:40
whitequarknot sure what may be the cause of it16:40
whitequarkthe shifted lines are consistent and stay at the same place16:44
whitequarkmay it be a PLL mislock?16:44
wpwrakyou should be able to see it on hsync. actually, most scopes have a tv mode that may be useful for this. i've never tried that, though16:53
whitequarktv mode?16:54
wpwrakwhat you certainly can do is trigger on vsync, then walk through the hsync. a bit messy, but should work16:54
wpwraktrigger on tv line and such. not sure if it works for vga or only for composite16:54
wpwrakDocScrutinizer: probably knows such things ;-)16:54
whitequarkah yes. afaik it's composite only16:55
DocScrutinizererr, the scopes I know all have a separate trigger input (optional)20:51
DocScrutinizerwhitequark: what I see is a one-pixel-off issue it seems. Might be caused by driving the display via a digital interface with pixel clock, vsync, hsync, the pixel clock not in phase with hsync, and some noise on either or both lines as well. The screenshots don't really allow as much analysis, but to me it seems the one-pixel-off is not constant along one horizontal scanline, i.e. there's no issue in a line on left side of screen 21:03
DocScrutinizerwhile right side same line shows that offset (or other way round). If that's actually correct, then it's clearly noise on pixel clock and pixel clock not in phase with edges on data lines, so display is randomly picking one time the "old" pixel data while next time picking the "new" (later) pixel data21:03
DocScrutinizerplacing sth like a 10pF..1nF load capacitor to gnd on pixelclock should create visible massive changes in the effect (you quite frequently see such timing-fixer capacitors on video [and other] clock lines)21:07
DocScrutinizeron pixel clocks that are operating always on rising (or always on falling) edge, this effect commonly is caused by inverted polarity of clock signal: active edge is meant to occur exactly in the middle of the data-steady time window, while inactive edge can occur roughly around the time when level transitions on data lines for new pixel data take place. If active and inactive edge are swapped due to some inverter in the clock line, 21:12
DocScrutinizeryou'll find the display pick old or new data from data lines on a random basis21:12
DocScrutinizerpropagation delay differences between clock and data lines of ~1/2 pixel time period will have similar effect21:13
DocScrutinizeroops I forgot to take into account your point that the lines with offset are always same place. Might indicate the noise on clock line is from crosstalk from video RAM addr lines. Or your video RAM is actually defective, such an internal short between cells is a quite common failure pattern21:18
DocScrutinizerthe manga picture is a poor test pattern to really investigate it. You should use Red-BlacK / Blue-BlacK / Green-BlacK chessboard patterns and vertical line patterns, also vert.line patterns of several clearly distinguishable colors and 1 pixel line width21:23
DocScrutinizerthen mark the error spots on screen (with a marker pen, maybe on transparent foil attached to screen with sticky), and see if they stay permanent errors no matter what's the displayed image, and maybe even try to change timing of whole video a little and see if the spots move or vanish. If they don't then it's most likely a defective video RAM21:27
Action: DocScrutinizer remembers the funny effects on video output he has seen on devices that had some short between two addr lines of video RAM21:30
DocScrutinizera short between addr and data line (or a capacitor mixing both when they are multiplexed on same bus) is sometimes even more funny21:31
wpwrakDocScrutinizer: ah yes, inverted polarity would also be a candidate. do you know that we had this once at openmoko ? and EE had been complaining to the LCD maker forever because of the lousy yield ;-))22:43
DocScrutinizermust've been pre-joerg times23:20
DocScrutinizer(vert.line patterns of several clearly distinguishable colors) like 9 colors, one line of each, then repeating the pattern. Don't use 2^n number of colors23:26
DocScrutinizeras odds are the commonly seen "mirroring" as a symptom of video RAM defects tends to skew with a natural base-2 magnitude23:28
DocScrutinizerso if you use 8 colors, then repeat, you might not notice anything odd even when video ram copies a whole set of 8 pixels to next 8 pixel field23:29
DocScrutinizeror to any aligned 8 pixels23:30
DocScrutinizerI.E. with an 8 line pattern test image you won't see problems on any addr lines other than A0, A1, A223:31
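The argument for a 9-colour pattern instead of an 8-colour one can be checked with a small simulation. The sketch below makes several simplifying assumptions that are not from the log: one pixel per address, pixels addressed linearly along a scanline, and a fault modelled as one address bit stuck at 0. The function name and default sizes are illustrative.

```python
def visible_faults(period, width=640, addr_bits=10):
    """Return the address bits whose stuck-at-0 fault changes the picture
    when the test image is vertical lines cycling through `period` colours."""
    def colour(x):
        # Vertical-line pattern: the colour index depends only on the column.
        return x % period

    visible = []
    for bit in range(addr_bits):
        mask = ~(1 << bit)
        # With the bit stuck at 0, column x shows the data stored for x & mask.
        if any(colour(x & mask) != colour(x) for x in range(width)):
            visible.append(bit)
    return visible


if __name__ == "__main__":
    print(visible_faults(8))   # power-of-two period
    print(visible_faults(9))   # period that is not a power of two
```

With a period of 8, a fault on any address bit from A3 upwards maps each pixel onto another pixel of the same colour, so only A0 to A2 show up. With a period of 9, no power of two is a multiple of the period, so every address bit in the model produces a visible error.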
wpwrak(pre-joerg) yeah, that didn't happen on your watch :) it was a software problem anyway, but of course, hardware got the beating23:43
--- Sat Oct 22 2011 00:00
