#milkymist IRC log for Saturday, 2011-08-20

aw0x32: 1. stopped at 'Bitstream length: 1484404' while flashing. 2. unplugged and replugged the jtag board, then could reflash...but d2/d3 dimly lit after flashing 3. resoldered u9 flash chip, still the same. 4. replaced u7/u19/u20 with new parts 5. applied fix2b circuit 6. stopped @ 'Bitstream length: 1484404' while reflashing. 6. tp36 - 690mV, tp37 - 793mV 7. D16(in-circuit): For.V. = 153mV, Rev.V. = 1426mV down to 836mV(not constant)04:11
aw8. tp36/tp37 - random pulses (stable 0.76V with random pulses up to 2.2V) 9. removed C238, tp36/tp37 pull high correctly at 3.29V 10. fitted a new C238, D16(in-circuit): For.V. = 158mV, Rev.V. = 1545mV 11. reflashed successfully 12. board auto-boots (d2 is ON) and the monitor screen stops at the Milkymist One logo after power-on, and it shuts down intermittently without pressing the middle btn, not reproducible every time; sometimes it even can't reconfigure (tp36/tp37 level not pulled high enough) after a power-cycle.04:11
aw13. D16(in-circuit): For.V. = 152mV, Rev.V. = 758mV and continually dropping(not constant) 14. took the reset ic off: measured impedance (over 20 Mega ohm) between Out and Gnd, same as a new part (unmounted); 8.3 M ohm between Vcc and Out, same as a new part (unmounted). Too bad I didn't measure impedance TP36 to gnd 15. checked another good board, impedance TP36 to gnd: 10.23 k ohm, 18.12 k ohm @TP37 to gnd04:11
aw16. soldered a new D16 back 17. 10.05k ohm @TP37 to gnd, 17.48k ohm @TP36 to gnd 18. observed a good 3.3V on TP36 for at least 1 minute after power on (d2/d3 fully off). kept probing, then it started pulsing and d2/d3 became dimly lit. TP36/TP37 voltage dropped to a stable 2.4V.04:11
awsurely can't read back from flash, stopped @'Bitstream length: 1484404'04:19
awif tp36/tp37 stays at 3.3V, dump can be done, i think.04:21
aw0x3c: readbacking...04:27
aw0x3c: dump - http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/bitstream/0x3c-standby1.bit/04:45
aw0x32: readbacking......hope it can be dumped smoothly04:49
aw0x32: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/bitstream/0x32-standby1.bit/05:07
wolfspraulaw: I think the results yesterday looked good, no?05:14
wolfspraulin the bottom line. So I see no reason why we cannot proceed with fix2b for all 90 boards.05:15
wolfspraulof course it's good to study 0x32/0x3C a little bit.05:15
awyes, i felt fix2b was good yesterday...although we still have unstable/intermittent pulses at TP36 whose cause we don't know yet.05:17
awso next Monday I think that I could keep rework fix2b on others firstly05:17
awbut meanwhile if werner have any new idea want me to study those 0x32/0x3c/0x77, I can be interrupted firstly for a while05:18
awso reworks via fix2b will be continued next week05:19
wolfspraulyes I think that's a good plan05:19
awi'm dumping 0x77 now for records firstly05:19
wolfsprauldefinitely - clean records, very important05:20
awlater I'll all update in wiki notes05:20
awfinish 0x77 dump..I'll go out...so we continue next week, okay?05:20
awat least now we know 'writing' to NOR left one word zeroed out on 0x48, and reading via the usb-jtag board is actually quite reliable. Regarding 'writing': we don't know yet whether it is a urjtag or a bitstream problem though. ;-)05:24
aw0x77: http://downloads.qi-hardware.com/hardware/milkymist_one/production/rc3/test_results/bitstream/0x77-standby1.bit/05:28
awwolfspraul, given the 0x48 history with one word zeroed out, can we say the 0x85 CRC fail-then-pass i saw with the test program is plausible?05:33
wolfspraulyes probably good [0x85]05:34
wolfspraulbut we keep it in a special 'hold' condition same as 0x4805:34
awyup...i also think 0x85 is good, but now i didn't put it 'avail -fix2b' for sure05:35
awthat could be just probability rate though it happened while power down...well..now don't know yet.05:36
awalright, the wiki ods file is up-to-date.05:37
wolfspraulwell. if power-down really corrupts the nor, we have another big rework in the pipeline.05:41
wolfspraulafter reading the 0x85 notes more carefully, I don't see a big problem like with 0x48 because the crc error showed up immediately after the test program ran, if I understand the notes correctly06:19
wolfspraulthen all 10 rendering cycles passed without incident06:20
wolfspraulwe find out more details on Monday, but so far I'm not worried. no indication that we may need the 4.4v reset ic rework...06:20
wpwrak__good morning ! :) lemme catch up ...06:45
wpwrak__(0x85) was there a reflash between the bad and the good CRC check ? or did just rebooting "fix" it ?06:51
wpwrak let's use this window ... fewer underscores in my nick :)06:52
wpwrakthe iliad seems short compared to the epic journey of 0x32 ;-)06:53
wolfspraulI'm not 100% sure about 0x85 either and Adam went into his well deserved weekend06:55
wolfspraulbut in either way I think the notes indicate the crc problem was very early, before rendering cycles06:55
wolfspraulit may even be the exact same software bug as 0x48 (guess in both cases, no evidence)06:55
wolfspraulbut the important thing is - we have no evidence of power-down related nor corruption06:56
wolfsprauland we have no boards that fail after the first render cycle06:56
wolfspraulthat's what I'm watching for. so I think Monday Adam continues with fix2b across the entire batch, and we follow the results.06:57
wolfspraulafter 30 or so are 100% good I think we can stop and assemble/package them06:57
wpwrakwhat i wonder is if the NOR in 0x85 apparently changed without rewriting it. in 0x48, a reflash made the error go away, which is consistent with a write having gone awry for some reason (sw bug, stray corruption)06:57
wpwrakif 0x85 changed spontaneously, it's more like old friend 0x3a06:58
wolfspraullike I said many times. as long as the 100% pass boards are all stable, we can look at these things later.06:59
wolfspraulthe main question is whether something like 0x34 can ever fail again :-)06:59
wolfspraulif it can, we have a problem06:59
wolfspraulif it cannot, it doesn't matter that there are some trouble-makers in the run, even if they cause more design improvements06:59
wpwraki'll think of some nice torture tests for 0x34 :)07:00
wolfspraulso as long as 0x34, 0x39, 0x40, ... are stable, all is fine07:01
wolfsprauland I am watching for evidence that they may not be stable, in other words watching for evidence that our test process is leaky07:01
wolfspraulbut so far in hundreds of render cycles, not one bit of such evidence has come up07:02
wolfspraulthey all either lead to 'cleanly failing' or to 'cleanly passing' boards, very binary07:02
wpwrakwhat bothers me with things like 0x3a is that i still can't point a finger to a specific location. e.g., FPGA content, FPGA I/O, bus, NOR I/O, NOR memory cells, maybe power, maybe reset. we have seen only high-level effects, not the isolated phenomenon (with the understanding that there are limits to how far we can get)07:03
wolfspraulunderstood. but you also understand my logic. I am not aiming for perfection. I am aiming for 100% pass boards that I can sell and support with a straight face.07:03
wpwrakwe have a few that aren't quite clean, possibly including 0x8507:03
wolfspraulwe see on Monday. I think it never made it to the first rendering.07:04
wpwrakyes, i understand. what i'm saying is that for me, these odd boards may still point to a real reliability problem07:04
wolfsprauloh sure07:04
wolfspraulbut that's a yield improvement07:04
wolfspraulfor this run or future runs or derived products07:05
wpwraknot, even a real problem wouldn't have to be devastating, but i'd at least like to know it a bit better. e.g., to be able to predict when or how it might strike07:05
wolfspraulbut the customer who buys a 100% pass rc3 board now will never notice the improvement07:05
wpwrake.g., if all the M1 will fail after 20 hours of operation above 30 C, that wouldn't be so good :)07:05
wpwraki also wouldn't aim for "no board left behind". but at least i'd like to know what to put on the epitaph :)07:07
wolfspraullike I say. two different problems.07:07
wolfsprauland raising the bar of the test process, even design verification, to higher levels, is yet another separate problem07:07
wolfspraulI am aiming for 100% understanding of all 90 boards, definitely07:08
wolfspraulbut realistically we will write some off 'unexplained'07:08
wolfspraulfor economic reasons, and because at some point the history including reworks introduces too many unknowns/variables07:08
wolfspraulbetter to stop and try again in the next run07:08
wpwrakyes, the risk of working adam too hard must be considered07:09
wpwrakbut i think, so far, he enjoys the occasional break from bulk rework :)07:10
wolfspraulI think Adam will test and fix rc2 for another 1-2 months.07:11
wolfspraulsorry rc307:11
wolfspraulwe see, too early to estimate now07:11
wolfsprauleven though there were some bumps, we have to do a proper job with all boards07:12
wolfspraulthe bumps were great and helped us to understand and improve a lot of things07:12
wpwraklooking at the results ... 0x32: wow, i'm surprised he even got a dump. that NOR is certainly a mess :)07:12
wpwrakah, wait .. picked the wrong one07:13
wpwrakno, 0x32 looks more orderly07:13
wpwrakten 0->1 errors in bit 707:13
wpwrakall in the address range (hex) 7660-82c007:14
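The kind of comparison behind the summary above (ten 0->1 errors, all in bit 7, within 7660-82c0) can be sketched roughly as follows. This is only an illustration, not the tool that was actually used: the function names are hypothetical, the demo data is synthetic (only the first and last offsets and the bit position come from the log), and it assumes byte-wise offsets, while the log does not say whether the addresses are byte or word addresses.

```python
def bit_flips(reference: bytes, readback: bytes):
    """Yield (offset, bit, direction) for every bit that differs."""
    for offset, (ref, got) in enumerate(zip(reference, readback)):
        diff = ref ^ got
        for bit in range(8):
            if diff & (1 << bit):
                # Direction is seen from the reference towards the readback.
                direction = "0->1" if got & (1 << bit) else "1->0"
                yield offset, bit, direction


def summarize(reference: bytes, readback: bytes):
    """Condense all flips into count, affected bits, directions and range."""
    flips = list(bit_flips(reference, readback))
    if not flips:
        return None  # readback matches the reference
    offsets = [offset for offset, _, _ in flips]
    return {
        "count": len(flips),
        "bits": sorted({bit for _, bit, _ in flips}),
        "directions": sorted({d for _, _, d in flips}),
        "first": min(offsets),
        "last": max(offsets),
    }


# Synthetic demo: an all-zero "reference" with bit 7 set at ten made-up
# offsets. Only the end points 0x7660 and 0x82c0 are taken from the log.
demo_offsets = [0x7660, 0x7700, 0x7800, 0x7900, 0x7a00,
                0x7b00, 0x7c00, 0x7d00, 0x8000, 0x82c0]
reference = bytes(0x9000)
damaged = bytearray(reference)
for off in demo_offsets:
    damaged[off] |= 0x80
print(summarize(reference, bytes(damaged)))
```

A perfect readback, as reported for 0x3c, would make summarize() return None.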
wpwrak0x3c readback was perfect07:16
wpwrakmaybe 0x3c is a good start for further analysis. there, we already know the NOR is good at the moment07:18
wpwrakthe relevant test would be, when the board is in the "messy voltage" state, to inject a limited current towards 3.3 V into TP36 (PROGRAM_B) and TP37 (FLASH_RESET_N)07:19
wpwrakthen see how much is needed to bring it up. (limited current) e.g., 100R from the 3V3 rail in series with an amperemeter07:20
wpwrakif the problem is a short at or near C238, TP36 and TP37 should be roughly the same, and the current may be quite large (> 10 mA)07:22
wpwrakif it's a failure to pull up TP37 alone, due to lack of FPGA pull, the current should be small (microamps) on TP36 and even less - possibly even negative - on TP37.07:24
wpwrakif it's a short to ground on FPGA or NOR, TP37 should pull all it can, but TP36 should be zero (as above)07:25
wpwrak(short on FPGA, i mean P22, the one that drives FLASH_RESET_N)07:25
wpwrakif it's just a little on both TP36 and TP37, maybe R30 has an issue07:26
wpwrakso these tests will tell us a lot07:27
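The decision rules in the test plan above can be written down as a small classifier, which may help when logging the measurements. This is a sketch under stated assumptions: only the "> 10 mA" figure, the 100R resistor and the 3.3 V rail come from the discussion; the 0.1 mA cut-off for "microamps", the 25% tolerance for "roughly the same", the function names and the result strings are all hypothetical choices for illustration.

```python
V_RAIL = 3.3          # volts, the 3V3 rail
R_SERIES_OHM = 100.0  # the suggested 100R series resistor
LARGE_MA = 10.0       # "quite large (> 10 mA)" from the test plan
SMALL_MA = 0.1        # ASSUMPTION: stand-in for "small (microamps)"


def max_injected_ma() -> float:
    """Upper bound on the injected current: test point shorted to ground."""
    return V_RAIL / R_SERIES_OHM * 1000.0


def roughly_equal(a: float, b: float, tolerance: float = 0.25) -> bool:
    """ASSUMPTION: 'roughly the same' means within 25% of the larger value."""
    return abs(a - b) <= tolerance * max(abs(a), abs(b))


def classify(i_tp36_ma: float, i_tp37_ma: float) -> str:
    """Map ammeter readings (mA into TP36 and TP37) to a suspected cause."""
    if (i_tp36_ma > LARGE_MA and i_tp37_ma > LARGE_MA
            and roughly_equal(i_tp36_ma, i_tp37_ma)):
        return "short at or near C238"
    if i_tp37_ma > LARGE_MA and i_tp36_ma <= SMALL_MA:
        return "short to ground on FPGA (P22) or NOR"
    if i_tp36_ma <= SMALL_MA and i_tp37_ma < i_tp36_ma:
        return "TP37 not pulled up (lack of FPGA pull)"
    if SMALL_MA < i_tp36_ma <= LARGE_MA and SMALL_MA < i_tp37_ma <= LARGE_MA:
        return "possible R30 issue"
    return "inconclusive"


print(f"max current through 100R: {max_injected_ma():.1f} mA")
print(classify(20.0, 21.0))
print(classify(0.005, -0.001))
```

With a 100R resistor the injected current cannot exceed about 33 mA, so a reading near that value means the test point is close to a dead short. Readings that fit none of the four patterns are reported as inconclusive rather than forced into a category.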
wpwrakwhat sets 0x3c (and a few others) apart from the tested and found to be good boards is that they still show suspicious voltages on TP36/3707:28
wpwrakso they may just form a "fix2+fix2b needs more prodding" cluster. but maybe we can spot the origin of that cluster. e.g., a visually detectable soldering problem07:29
wpwrak0x77's NOR is also perfect07:31
wpwrakalso has the voltages problem. there's definitely a cluster.07:32
wpwrakthese are tricky, because they did survive a number of things. such as writing the NOR and reading it back. they also show that the in-circuit test of D16 doesn't catch this issue, so it's somewhere else07:33
wpwrakyou saw the last bits of my ramblings ? up to and including "these are tricky. [...]" ?07:37
wolfspraulI think I got it ... "visually detectable soldering problem"07:39
wolfspraulI'll read the weblog in a bit07:40
wolfspraulI already mentally bookmarked "follow Werner's test plan on Monday" :-)07:40
wolfspraulthe details I will lookup then :-)07:40
wpwrakhehe, good :)07:48
wpwrakin any case, with fix2b and the testing, M1rc3 made a lot of progress. i think that fixed 75% or 80% of the "cluster", didn't it ?07:49
wpwrakmaybe you should brag a bit about it on the list :)07:50
wolfspraulah, I don't know. my todo is exploding everywhere and I need to keep my energy for the launch and marketing.07:55
wolfspraulplus there are still unknowns, I just wait for more data first07:56
wpwrakjust to let people know what's happening and that something is happening. doesn't have to be a novel :)08:03
rohwolfspraul: heh. regret being a pioneer, holding only a knife and a folding shovel, standing in dirty boots somewhere in no-mans-land?08:15
wpwrakroh: there be no regrets ! ;-)08:20
lekernel1-2 months ?????10:55
wolfspraul:-) sorry about that. you took that out of context.11:07
wolfspraulof course everything we do the last weeks is prioritized to bring in the first day of sales.11:07
wolfspraulI meant that I can see Adam continuing on the 'long tail' of rc3 1-2 months _after_ the first day of sales, yes.11:08
wolfsprauland I hope that's a conservative estimate11:08
wolfspraulI hope it's 2 weeks. but I cannot always create more pressure on Adam, he is already working 70+ hours / week for over a month11:08
wolfspraulAdam even told me he takes a 1 week vacation in late September. The first in 2 years :-)11:09
wolfspraulI think he deserves it...11:09
wpwrakdamn. vacation ! we should never have abolished slavery !11:16
wolfspraulwe have a lot of rc4 verification parts in Taipei already. adv7181c, gates, 4.4v reset ics11:21
wolfsprauleverything there11:21
wolfsprauljust hours of the day still limited to 24... :-)11:21
wolfspraulSebastien had a nice idea about the rtc. we can build a little daughterboard for the expansion header, and put a cheap rtc chip + cr2016 battery or so on it11:22
wolfspraulthat could even be retrofitted by rc1/rc2/rc3 users who want it11:22
wpwrakso those ~2 months are what you predict will be the time until rc4 gerber out ?11:24
wolfspraultoo many unknowns now11:25
wpwrak(daughterboard) ah yes, why not. if it works well, you can then merge it into rc5 or so11:25
wpwrakokay, what's your "internal schedule" ? :)11:25
wolfspraulalso we could make a few hundred UBB-style 'base-boards' for people to experiment with11:25
wolfsprauli have no internal schedule, all full power forward11:26
wolfspraulI think it depends on the launch, a lot11:26
wpwrakfull speed ahead in no particular direction ? aw ... :)11:26
wolfspraulwho carries the news? do we have a good shop landing page11:26
wolfsprauloh no11:26
wolfspraulof course towards the world's best video synthesizer11:26
wolfspraulso the variables11:26
wpwrakheh :)11:26
wolfspraul1. speed of sales11:26
wolfspraul2. discoveries on rc3 or rc4 design verification after we start selling rc311:27
wolfspraul3. customer feedback from rc311:27
wolfspraul4. potential larger customers who want to put in a big order11:27
wolfspraulIf sales are super slow, and nobody is interested in buying more, then we have more time for rc4 :-)11:28
wolfspraulthen I will spend my time marketing the box, even traveling around to potential customers maybe11:28
wolfspraulsystematically, the rc4 run should probably be about twice the size of rc3, so let's say 160 units11:28
wpwrak(slow sales = more time) very good ! :)11:28
wolfspraulbut it could be less or more depending on the unknowns that slowly come in11:29
wolfspraulso how can I answer your question now: not at all11:29
wpwrakhmm, you'll need an assistant for adam then11:29
wolfspraulI should work on a great shop landing page11:29
wpwrakyeah, the shop is important, too11:29
wolfspraulJon had some ideas, have to follow up etc and then also do it11:29
wpwraklemme see what milkymist looks like now ...11:30
wpwraksharism -> "Product not found!" :(11:31
wolfspraulit's impossible to say more now11:31
wolfspraulyeah sure ;-)11:31
wolfsprauleven if I think about extreme cases, anything is possible. say a customer shows up who wants to buy 100 right away, and pre-pays 50%. how fast can we ship?11:31
wpwrakyou shouldn't un-list out of stock items. give people something more useful. e.g., send them to a different distributor, tell them when you expect more, or how to get notified11:32
wolfspraulwell, of course if we feel good about rc3, we can also use the exact same gerber and produce that11:32
wolfspraulwhy not11:32
wpwrakat least now we know more or less how to fish the bugs out of rc3 ;-)11:32
wolfspraulit's a bit additional risk, but if a customer wants to order and we feel we can produce, then that's the fastest way11:32
wolfspraulbut this case is unlikely. in that case we could do the next smt as early as 3 weeks later. better not tell adam.11:33
wolfspraulgerber out monday. :-)11:33
wpwrakyes, sure. you wouldn't let a big order slip if you can help it11:33
wolfspraulso anyway, way too many unknowns now.11:33
wolfspraulif gerber out is Monday and it's the same pcb, who knows maybe smt the following Monday :-)11:34
wpwrak"sorry, your vacation just got canceled. and about those 70 hours week, i think you should also put in the weekends" ;-)11:34
wolfspraulthen ready to ship Wednesday?11:34
wolfspraulI hope Adam doesn't read this out of context, everybody will get their shocks...11:34
wolfspraul@adam: WE ARE JOKING!11:34
wolfspraulI like the daughterboard idea11:35
wolfspraulwe should make little breakout boards for people to experiment11:35
wpwrakwhat are the headers ? 0.1" ?11:36
wolfspraulmaybe first just the naked board that exposes the pins to the next level, free for the soldering attack11:36
lekernelwpwrak: 2.54mm yes11:36
lekernelstandard stuff11:36
wpwrakvery good11:36
wpwrakwolfspraul: 0.1" shouldn't really need anything additional11:36
lekernelwhat can make sense is an optoisolated breakout11:37
wpwrakwolfspraul: it's all standard items from there on11:37
wpwraklekernel: yes, something that adds circuit11:37
wolfspraulstandard or not a little board is helpful11:37
wolfspraulotherwise everybody has to do that first11:37
wpwraklekernel: but just plastic, copper, and FR4 ? naw.11:37
wolfsprauland optoisolation is a good idea too11:37
lekernelinexperienced people do all sort of wrong stuff with electronics (inductive loads without flywheel diodes, short circuits, overvoltages, ...) which could easily damage the fpga when connected directly11:38
wpwrakwolfspraul: why do you need the board ? you just get a standard connector, add a ribbon cable, and connect it to whatever you want11:38
wolfspraula board provides space11:38
wolfsprauloptoisolation argument is strong imho11:39
wpwrakwolfspraul: most people who are at a point where they would want to attach their own stuff to an M1 are probably at the point where they have solved the basic prototype board problem in some way11:39
wpwrakwolfspraul: breadboard, pre-patterned PCB, DIY PCB, there's a lot of options11:40
wpwrakand 80% will probably just connect to their beloved arduino ;)11:41
wpwraklekernel: how's 5V compatibility ? (-:C11:41
wolfspraulthere's a big note on the silkscreen about that :-)11:41
wolfspraulI still think a safe starting point would be great11:41
wpwrakwolfspraul: arduino in circle, with a line across it ? ;-)11:42
wolfspraulsince this will interact with the ic design running in the fpga, there will necessarily be a lot of 'live' development with the chips wired up to the fpga11:42
wolfspraula cheap standard expansion daughterboard may help11:43
wpwraki don't quite see the use case. you already have a perfect standard connector. probably the first thing people learn to connect to :)11:44
wpwrakand a cable that goes out of the M1 to your circuit is much safer than a PCB that hangs somewhere inside, over the rest11:45
wolfspraulok, maybe we document a few approaches, together with digi-key part numbers etc.11:45
wolfspraulthat may provide a similar effect11:45
wpwrakyeah. it's really basic.11:46
wolfspraulthere are 2 headers there, 2*8 and 2*911:46
wolfspraulI never know which is which. I think one is related to vga, the other one goes to the fpga.11:47
wpwrakif you want to make little extra boards, make them do something useful. e.g., galvanic isolation (btw, there's more interesting stuff than just opto. i'll try to dig it out after breakfast)11:47
wpwrakor if you want, an arduino interface board ;-) complete with example code that lets the M1 blink a LED on the arduino and that lets the arduino send a "hello world" to the M1, for rendering ;-)11:48
wpwrakmaybe you can get tuxbrain interested :)11:49
lekernelwpwrak, there's midi to talk to arduinos11:50
lekernelno need to open the case, just connect the midi port to the arduino serial pin with a resistor11:50
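To give an idea of what would travel over that MIDI link, here is a minimal sketch of standard MIDI channel messages as raw bytes. The byte layout and the 31250 baud rate are part of the MIDI standard; the helper names are hypothetical, and nothing here describes how the M1 firmware or any Arduino sketch actually handles these messages.

```python
MIDI_BAUD = 31250  # standard MIDI serial rate: 8 data bits, no parity, 1 stop bit


def _check(channel: int, *data: int) -> None:
    """Validate channel (0-15) and 7-bit data bytes (0-127)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    for value in data:
        if not 0 <= value <= 127:
            raise ValueError("data bytes must be 0..127")


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Note On: status byte 0x9n followed by note number and velocity."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Note Off: status byte 0x8n followed by note number and velocity."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


# Middle C (note 60) on the first channel, as three raw bytes.
print(note_on(0, 60, 100).hex())
```

On the Arduino side, the serial port would have to be opened at 31250 baud to read or write such bytes.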
wolfspraulJ3 is 2*9 and connected to some audio/video codec wires11:50
wpwraklekernel: both ways ?11:51
wolfspraulah no both are 2*911:51
wpwraklekernel: perfect11:51
wolfspraulJ21 goes to the fpga11:52
wolfspraulit says 'not 5V tolerant' but it seems J21 provides both 5V and 3.3V11:53
wpwrakmaybe add a jumper in rc4, to enable 5 V only if you're really sure you know what you're doing  ?11:55
lekernelyes, the 5V pins are here to drive the sync signal of a hypothetical dual screen output11:55
lekernelno, if you touch this connector either know what you are doing or use an optoisolated adapter11:56
wolfspraulI don't see what's wrong with providing some power. people have to read the schematics anyway. :-)11:56
wpwraknothing wrong with providing power. but mistakes happen .. :)11:56
wolfspraulanother thing is how the pins going to the fpga are chosen. I understand they are not all the same. different banks or so? voltage domains? not sure. I think we will hear over time "too bad that wire XXX is not on the expansion header..."11:57
wolfspraulor I hope we hear because it means people build expansions :-)11:58
wpwrakyeah. if everyone is quiet, that doesn't always mean they're happy :)12:00
wpwraklekernel: (isolators) have you seen these critters yet ? http://www.analog.com/static/imported-files/Data_Sheets/ADUM5240_5241_5242.pdf12:21
wpwraklekernel: no optics. and they also provide power :)12:21
wpwraklekernel: also exist with 4 data channels12:21
lekernelyes, I received some AD spam about it already12:38
lekernelthey're pretty cool12:39
wolfspraulwpwrak: what's special about this and what can you build with it?12:46
kristianpaulribbon cable :)13:19
kristianpaulBUT, in my case for example a PCB is nice if can hold something on it and i finally can put that acrylic thing in the top again..13:21
wpwrakwolfspraul: (special) high speed and they also transfer power14:20
wolfspraulwhich applications does that translate to?14:29
wpwrakanything that needs galvanic separation plus a bit of circuit on the other end. e.g., for protocol processing. for example USB to some other serial protocol. one side has a USB device chip, the other some other MCU (or whatever). the USB side powers both.14:40
wpwrak(via this isolator)14:41
kristianpaul[12483.356146] usb 1-3: usbfs: usb_submit_urb returned -12120:02
kristianpaulbad fedora ;)20:06
kristianpaulhum, from what i can see it is related to libusb and kernel indeed20:11
kristianpaulbut seems kinda old issue...20:11
kristianpaulkinda annoying bug indeed20:17
kristianpaulthe cost of try the fancy gnome3, xD20:28
kristianpauleven at full speed the same message20:41
kristianpaulso this is a software problem..20:41
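The number in the kernel message above is a negated Linux errno value, and 121 corresponds to EREMOTEIO ("Remote I/O error"). A small sketch for pulling the device and code out of such a log line follows. The regular expression and the function name are hypothetical, and the table is a hand-picked subset that assumes Linux errno numbering; it does not diagnose why the URB submission failed.

```python
import re

# ASSUMPTION: Linux errno numbering; only a few common values are listed.
ERRNO_NAMES = {
    19: "ENODEV",
    32: "EPIPE",
    71: "EPROTO",
    110: "ETIMEDOUT",
    121: "EREMOTEIO",
}

URB_LINE = re.compile(r"usb (\S+): usbfs: usb_submit_urb returned (-?\d+)")


def decode(line: str):
    """Return (device, code, errno name) for a usbfs submit-failure line."""
    match = URB_LINE.search(line)
    if match is None:
        return None  # not a usb_submit_urb failure line
    device, code = match.group(1), int(match.group(2))
    return device, code, ERRNO_NAMES.get(-code, "unknown")


print(decode("[12483.356146] usb 1-3: usbfs: usb_submit_urb returned -121"))
```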
--- Sun Aug 21 201100:00
