#milkymist IRC log for Sunday, 2012-10-21

sh4rm4this channel is shrinking...02:16
wolfspraulsh4rm4: in which way? what is your expectation?02:26
wolfspraulgood morning btw :-)02:26
sh4rm4well, it was 40+ a while ago02:26
wolfspraulI joined ##fpga recently and after seeing an average of 1 'shit' per line I realized that is not for me and left :-)02:27
wolfspraulI will also become more picky on which channels I am in and what I consider quality etc...02:27
wolfspraulwhat more can and should we expect from milkymist?02:27
wolfspraulafaik there is no roadmap or planned releases right now, with some date or so - I think...02:28
sh4rm4*shrug* it looks as if the activity got quite low here02:28
sh4rm4lekernel talks rarely02:28
sh4rm4and ppl leave02:28
wolfspraulyes sure, that's why I am asking about your expectation :-)02:28
wolfspraulmine is pretty close to zero, so I'm not disappointed :-)02:28
wolfspraulI don't think we see another milkymist-obsolete update02:28
wolfspraulor should expect one02:28
wolfsprauland I am not sure about the path of -ng, if there is one02:29
wolfspraulwhat is your expectation?02:29
sh4rm4you think the the -ng caused a split ?02:30
sh4rm4*that the02:30
wolfspraulhard to say02:30
wolfspraulthere hasn't been an -ng release afaik02:30
wolfspraulit seems the project reached a ceiling and it's difficult to advance, in any direction02:31
wolfspraullekernel needs to set that direction and lead, if he wants to02:31
wolfspraulI think it's tough now. the path to higher memory bandwidth is very difficult on m1 or a new board02:32
sh4rm4the milkymist-ng work looks pretty promising though02:32
wolfspraulnew board = many new difficulties02:32
wolfspraulyes sure :-)02:32
wolfspraulprayers are typically quiet, no? :-)02:32
wolfspraulso let's quietly continue praying :-)02:33
wolfspraulI think if you (or anybody) would clearly write what more they expect from milkymist, that would help02:33
wolfspraulunfortunately I cannot contribute that much really helpful stuff there, because I expect very little :-) I know how difficult it is to advance in the different directions...02:34
sh4rm4i am only a spectator here, so my opinion is not important...02:34
wolfspraulmaybe I'm not the only one and pretty much nobody expects anything02:34
wolfsprauloh, very important02:34
wolfspraulbecause eventually the number of 'spectators' goes down, until it reaches 002:35
wolfspraulyou are at least one human being...02:35
wolfspraulalive :-)02:35
wolfsprauland there's maybe 20 here? less actually reading?02:35
sh4rm4but yes, i would expect some more discussion, and lekernel being around more often02:35
wolfspraullekernel can override me, but I think milkymist has reached a ceiling of technical difficulties02:35
wolfspraulso you can pick out some pieces and advance them, but maybe others will fall behind then02:36
wolfspraul(other pieces)02:36
wolfspraulI think it's maxed out, and it has achieved a lot imho02:36
wolfspraulthe next attempt, in whatever direction it goes, will most likely only continue with some part of the last rounds02:37
sh4rm4ah. so you think it's at a dead point now ?02:37
wolfspraulbut what's the goal?02:37
wolfspraulthe old goal was to replace everything, the entire digital center/hub/cpu of a 'computer'02:37
wolfsprauland there were releases and they went to a certain point02:38
wolfspraulnow how to continue?02:38
wolfspraulthere are few people left so pretty much lekernel will set that direction02:38
wolfspraulI am continuing on a fpga tool but it's only very loosely if at all connected to milkymist02:38
wolfsprauldo you have an m1? what's the next thing you want to do with milkymist?02:39
wolfspraulwhy are you 'spectating' here? :-)02:39
sh4rm4i dont have one, now02:39
wolfspraulI think I will build a new experimental case for mine soon02:39
sh4rm4maybe i would have bought one if the price was about 200-250 usd02:40
wolfspraulwhat would you have done with it then?02:40
sh4rm4i did buy a C-one some time ago, for 300 eur, just to see that it doesn't work with my screen and to shelve it subsequently....02:41
sh4rm4to experiment around with lm32 cpu and embedded system software02:41
wolfspraulwhat is the C-one ?02:41
sh4rm4commodore one reconfigurable pc02:41
wolfspraulI think > 50% of the m1 ended up in an experience similar to yours :-)02:42
wolfspraulI mean yours with the C-102:42
wolfspraulat least not a single buyer asked to return his unit, which is very generous as they all understand the huge effort we took to pull off the project in the first place02:42
wolfspraulI think we will see some remix of milkymist tech in some shape or form02:43
wolfspraulbut when and where I have no idea02:43
sh4rm4however i think with the price tag you lost a huge market of hackers02:43
wolfspraulI think it will be a much more focused subset of functionality02:44
wolfsprauloh m1 almost sold out, I only have a few units left and the price will stay there until a m1 lover comes along02:44
sh4rm4software ppl get beagle boards, r-pi's and similar stuff to mess around now02:44
wolfspraulthe price could also be 4999 USD02:44
wolfspraulsure, why not02:44
wolfspraullego sold about 2 billion USD of toys last year, I think02:45
wolfspraulthat must be several hundred times more than all beagles and raspberries and arduinos combined02:45
wolfsprauland I mean this in a good way, those are all good projects02:45
wolfspraulI couldn't find much in a beagle or raspberry, but that's just me. I also wouldn't buy that c-one, but checking the tech now :-)02:46
wolfspraulI bought a bunch of tp-link wr703n and they are great02:46
wolfspraul<20usd little wifi computers/dongles02:46
wolfsprauloh nice that c-one project seems a really long-time thing02:47
sh4rm4yeah it has all kinds of cpu cores, can emulate c64, amiga, zx ...02:48
sh4rm4unfortunately only with a multisync screen02:48
sh4rm415 KHz oslt02:48
wolfspraulwhich fpga is on the fpga extender?02:49
sh4rm4one of the faster altera's iirc02:49
sh4rm4virtex or so02:49
wolfspraulvirtex is a xilinx brand name I think02:50
wolfspraulI read quartus somewhere on their site, must be an altera chip02:50
wolfsprauland you bought one of these - nice! congratulations02:50
wolfsprauleven though you couldn't find much use, it's good that you support a pioneering effort such as that one :-)02:51
wolfspraulin my opinion...02:51
wolfspraulso let's wait if/when/what lekernel cooks up the next big thing, huh? :-)02:51
wolfspraulmilkymist has some nice things that deserve to move forward02:52
wolfspraulwow, nice pic02:52
wolfspraulyes that board looks like A LOT of trouble and work waiting :-)02:53
wolfspraulI wouldn't touch something like that anymore02:53
wolfspraulit looks as if they are re-creating an old pc-style motherboard but with 2 fpgas02:54
wolfspraulhow did this project fare overall?02:55
wolfspraullet's see what wikipedia says :-)02:55
wolfspraulnice stuff02:57
wolfsprauloh I think one thing milkymist could do is some totally new angle of interaction02:57
wolfspraulsomething with rf or analog processing02:57
wolfspraulpick one problem and make a really compelling solution02:57
wolfspraulI think it has the power to do that, if someone picks something like that and pulls it off02:58
sh4rm4i think milkymist's niche is more about open source enthusiasts02:58
wolfspraulok but what's the next big thing then?02:59
wolfspraula random toy for a lonely weekend?02:59
sh4rm4well, the milkymist-ng concept makes that really easy03:00
wolfspraulgreat ;-)03:00
wolfspraulsebastien will be happy to hear that :-)03:00
sh4rm4one could even imagine a gui where you can en/disable components by toggling a checkbox...03:00
sh4rm4and produce a custom bitstream03:01
sh4rm4like ethernet support03:01
sh4rm4or usb03:01
wolfspraulwhat for?03:01
wolfspraulyou just want a box to play with and learn stuff?03:01
sh4rm4and ease usage of fpga related tech03:02
wolfspraulI don't see fpgas as that one totally amazing/different tech. they are just another type of chip like dozens/hundreds of other types.03:02
wolfspraulso the main question is still what you build in the end, imho03:03
sh4rm4it's programmable, and thus interesting to software folks03:03
wolfspraulfor fpgas there is a constant stream of devel boards that try to expose the more interesting functionalities of new chips03:05
wolfspraulbut in the end most people don't want to keep learning forever, and spend their entire life learning, but instead they learn something and then eventually start to do some real stuff :-)03:06
wolfspraulthat's why you see them playing (learning) with something for a while, and then they go away, because they learned enough and either switch to something else, or use what they learned in some productive way03:07
sh4rm4dealing with fpga devel boards is hard without a hw background, migen could lower that barrier considerably03:07
sh4rm4and then there's the open source aspect as well03:08
wolfspraulmigen is a fantastic tool I'm sure, but I highly doubt it will lower any needed background03:10
wolfspraulin fact I think the opposite - you will most likely be able to make good use of migen if you already have excellent background knowledge about logic design, fpgas, etc.03:10
wolfspraulthen it will make you even more powerful/productive03:10
wolfspraulbut those steep first steps, I think they remain03:11
wolfsprauljust look - the project is out now for how long? and how many people have found an entry to anything with it since then?03:11
wolfspraulzero, afaik03:11
wolfspraulthat doesn't look like it makes the first step to anything easier, which I also think you should not expect03:12
wolfspraulit gives you more power if you are already good with fpgas03:12
wolfsprauland logic design03:12
wolfspraulsh4rm4: have you started playing with migen?03:12
wolfspraulwhy not?03:13
sh4rm41) i am busy with my linux distro03:13
sh4rm42) dont know python03:13
wolfspraulyou say "migen could lower the barrier to fpga considerably". what unlocks the 'could'?03:13
sh4rm43) don't understand migen properly03:14
wolfspraulyou wait for a release? better docs?03:14
wolfspraulI know fpgas a little now, due to my fpgatools work03:14
sh4rm4if migen would be used under the hood03:14
wolfspraullet's say 10% of what sebastien knows03:14
wolfsprauland knowing those 10%, I am telling you "don't expect migen to lower your barriers"03:14
wolfspraullet's see what I think when I know 50% in a few years :-)03:14
wolfspraulunder the hood of what?03:15
sh4rm4i.e. i could supply a config file: build lm32 with mmu, plus ram controller, plus ethernet03:15
sh4rm4and migen does all the gory routing stuff etc in the bg03:15
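A rough sketch of the config-file idea sh4rm4 describes, in plain Python. Nothing here is Migen's API: the component names, the DEPENDS table and resolve() are hypothetical and only illustrate turning a wish list into an ordered set of cores to instantiate.

```python
# Hypothetical component catalogue: name -> components it needs.
# Names and dependencies are illustrative assumptions, not Milkymist's real SoC layout.
DEPENDS = {
    "lm32": [],
    "mmu": ["lm32"],
    "ram_controller": [],
    "ethernet": ["ram_controller"],
    "usb": [],
}


def resolve(requested):
    """Return the requested components plus their dependencies, dependencies first."""
    ordered = []

    def visit(name):
        if name not in DEPENDS:
            raise KeyError(f"unknown component: {name}")
        for dep in DEPENDS[name]:
            visit(dep)
        if name not in ordered:
            ordered.append(name)

    for name in requested:
        visit(name)
    return ordered


if __name__ == "__main__":
    # "build lm32 with mmu, plus ram controller, plus ethernet"
    print(resolve(["mmu", "ethernet"]))
```

This covers only the selection step. Synthesis, placement, routing and timing closure, which is where wolfspraul expects the real difficulty, are outside the sketch.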
wolfspraulI think I understand what you mean, but I also think you should not wait for/expect that03:16
wolfspraulit just won't happen and work the way you think it should - ever03:16
wolfspraulmany more people need to join and make incremental improvements to a lot of tools. and then collectively maybe they can advance things towards what you have in mind.03:17
wolfspraulI do think that is happening, but it's slow and spread across a lot of tools03:17
wolfspraulfor example I want to learn more about some ic tools, magic/toped/qucs/etc03:17
wolfspraulbut that will take years...03:17
wolfspraulwhat do you mean with "under the hood"?03:18
wolfspraulthat config file example?03:18
wolfspraulI think you should not expect this for some years03:19
wolfspraulnot just because it's "hard" "difficult" or so, but simply because it may not make that much sense03:19
wolfspraulonly my opinion here... my famous 2c03:20
wolfspraulI'm sure tomorrow we see a big release that will make all this as easy as a swipe on your phone...03:20
sh4rm4i'm pretty sure it'll go into that direction, because it's more efficient and saves time and money03:21
wolfspraulwe should work on a milkymist-obsolete release that includes the mmu03:29
wolfsprauleither that or an -ng release with mmu, not sure what is the best path to make the mmu more accessible03:30
wolfspraulthat's a 'should' as in "I hope"...03:30
sh4rm4is the mmu usable at this point ?03:30
wolfspraulnot as in "I will do"03:30
wolfspraulit probably gets closer, sure03:30
wolfspraulchicken & egg03:30
wolfspraulif nobody starts using it and finds more bugs and helps bring the whole thing up, then it will just be there and everybody wonders "is it working?"03:31
wolfspraulbtw quick fpgatools update03:35
wolfspraulin case anyone cares. i ran into big delays03:35
wolfspraulone month ago I set a deadline of "1 month" to go from the unclocked AND design to a clocked counter03:35
wolfspraulthat time has almost passed03:35
wolfspraulbut... the code grew by 5000 lines or more, yet there is still tons more missing until all the clocking stuff is working03:35
wolfsprauland that's without dcm or pll, just plain gclk inputs and bufg etc.03:36
wolfspraulso I think I need another month for the clocked counter03:36
wolfspraulextending my deadline to +1 month :-)03:36
wolfspraulend of update03:36
wpwrak(c-one) funny they didn't include any usb, considering that they don't seem to be shy about having tons of interfaces07:49
wpwrak(migen) i agree with wolfgang. i don't see it as a tool to lower the entrance barrier. i see it more as a tool that makes building complex things less of a hassle. but when you're at the point where you actually need such a tool, then you're probably quite good at the basics already.07:50
larscwolfspraul: luckily for us you wont get fired if you miss a deadline07:50
wpwrakkinda like a power drill. makes it easy to put holes into things but you still need to know where to put the holes.07:51
larscand which size you want07:51
wpwraklarsc: a nominal one month deadline means four years anyway :) (add one, multiply by two, then convert to the next higher unit)07:52
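wpwrak's rule of thumb, written out as a small Python function. The ladder of units is an assumption made for this sketch; the log only gives the month-to-years example.

```python
# Assumed ladder of time units, smallest first.
UNITS = ["hour", "day", "week", "month", "year"]


def realistic_estimate(amount, unit):
    """Apply the rule: add one, multiply by two, move to the next higher unit."""
    idx = UNITS.index(unit)
    if idx + 1 >= len(UNITS):
        raise ValueError("no higher unit than " + unit)
    return (amount + 1) * 2, UNITS[idx + 1]


if __name__ == "__main__":
    # A nominal one month deadline: (1 + 1) * 2 = 4, in years.
    print(realistic_estimate(1, "month"))
```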
wpwrakjust think of all these "oh, i can do that in a weekend" projects that invariably end up consuming something like a month full-time :)07:53
larscor even years07:53
wpwrakyeah, once you get in the habit of missing deadlines, eternity is the limit :)07:54
wpwrakwolfspraul: a clocked counter with enough logic to drive a 7 segment display may be nice to show things.07:56
larscyou keep pushing your expectations, a few weeks ago you just wanted a blinking LED. Have you ever though about a carerr in management? ;)07:59
wpwrakyou mean carrot management ? :)08:01
larscno ;)08:01
wpwrakin this sense: http://apps.tenpearls.com/iimages/pf/ldonkey_il.jpg08:04
larscI know08:05
larscBut I really just meant like a real life manager. Comes to you one week wants feature set X, you implement it, present it and he complains that feature set Y and Z are not yet implemented08:06
wpwrakah, the complaining type. they're not fun. better to just let your underlings feel your poorly masked disappointment that they haven't exceeded expectations without you pushing.08:12
wolfspraullarsc: come on, shed a tear for the poor managers. maybe you misunderstood the importance (or existence) of features X Y and Z in the first place08:17
wolfspraulthat can happen much more easily than one thinks - are you *sure* you really listen all the time, listen well? or you just pick out the things you want to hear or expect to hear from the stream of words reaching your eyes (or characters reaching your eyes)08:18
wolfspraulover time maybe you will recognize that as a skill in and of itself08:18
wolfspraulsome people are just freaking good listeners, and others are not :-)08:18
wolfspraulwelcome to the real world...08:18
wolfspraulI meant: reaching your ears (the words) :-)08:19
wolfspraulmy deadline slipped, but ok. yes I set it myself, just for fun08:19
wolfspraulin fact now that I missed it anyway, I will go back and work on some more wires first ;-)08:19
wpwrakyou still have a week. you can make it :)08:20
larscluckily my managers don't care what I do, as long as I do it ;) at least kind of08:20
larscwolfspraul: btw. fpga tools is just for writing bitstreams or can it also load bitstreams?08:21
wolfspraulboth, but...08:22
wolfspraulreading a bitstream is a separate problem, if it should be meaningful08:22
wolfspraulbecause there are ambiguities and the more meaning you try to extract from the bits the more code and logic you need just for that08:22
wolfsprauland I will not work on those things08:22
wolfspraulso I can read and write both textual and binary bitstreams, but the reading of binary bitstreams is only for the purpose of roundtrip testing in the autotester08:23
wolfspraulnot for the purpose of going from a binary bitstream to verilog (for example), in the end08:23
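A minimal sketch of what roundtrip testing between a textual and a binary form can look like. The record layout (16-bit address, 32-bit value) and the function names are invented for illustration; this is not the fpgatools format or API.

```python
import struct

# Toy record: big-endian 16-bit address followed by a 32-bit value (6 bytes).
RECORD = ">HI"


def text_to_binary(text):
    """Parse lines of '<addr hex> <value hex>' into packed records."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        addr, value = (int(tok, 16) for tok in line.split())
        out += struct.pack(RECORD, addr, value)
    return bytes(out)


def binary_to_text(blob):
    """Render packed records back into the textual form."""
    lines = []
    for addr, value in struct.iter_unpack(RECORD, blob):
        lines.append(f"{addr:04x} {value:08x}")
    return "\n".join(lines)


def roundtrips(blob):
    """True if binary -> text -> binary reproduces the input byte for byte."""
    return text_to_binary(binary_to_text(blob)) == blob
```

A check like roundtrips() only shows that reader and writer agree with each other. It says nothing about what the bits mean, which matches the limited purpose described above.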
larscah ok. Where did you find all the information on the bitstream format?08:23
wolfspraulbut on the writing side, I do want to and have to add higher and higher level means to write stuff, including verilog backend (one day)08:23
wolfspraulthere is tons of documentation available, just putting pieces together and trying to verify a complete model which helps focus the whole thing08:24
wolfspraulif 2+2 does not come out as 4, something is wrong08:25
wpwraksounds easy ;-)08:25
wolfsprauland the math that says that 2+2 is/must be 4 is there (in the form of documentation and sources)08:25
Fallenouhey wolfspraul :)12:27
Fallenouhow are you? what's up ?12:27
wolfspraulhacking on fpgatools :-)12:28
wolfspraultoday actually I took a break12:28
wolfspraulbut other than that, next is... not sure. :-) maybe more east-west directional wires :-)12:29
Action: Fallenou reading the backlogs12:30
wolfspraulmusings about the next great milkymist :-)12:31
Fallenouhehe I just read that12:42
FallenouI am making progress on the MMU, in the direction of a "full feature" state (which does not mean a "fully debugged" state)12:47
FallenouBut there seem to be some modifications that must be made to already working things12:47
Fallenoufor exceptions in the cpu12:47
Fallenoubut to answer sh4rm4 question, mmu is working today, at least the virtual to physical translations and the tlb miss features12:51
FallenouI modified Milkymist BIOS to be able to handle the mmu, map pages and use them etc12:52
Fallenouwith exception/tlb miss handlers12:52
Fallenoulike a mini OS12:52
Fallenouand it runs12:52
Fallenouit does not mean that there is no bug in some corner cases12:52
Fallenoubut my tests pass :)12:52
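A toy model of virtual to physical translation with a software miss handler, to make the features listed above concrete. The 4 KiB page size, the dictionary-based TLB and all names are assumptions for this sketch; it does not describe the actual LM32 MMU design or the modified BIOS code.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages; the log does not state the page size


class TLBMiss(Exception):
    def __init__(self, vpn):
        super().__init__(f"TLB miss for virtual page {vpn:#x}")
        self.vpn = vpn


class ToyTLB:
    def __init__(self):
        self.entries = {}  # virtual page number -> physical page number

    def map(self, vpn, ppn):
        self.entries[vpn] = ppn

    def translate(self, vaddr):
        """Return the physical address, or raise TLBMiss if the page is not cached."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn not in self.entries:
            raise TLBMiss(vpn)
        return (self.entries[vpn] << PAGE_SHIFT) | offset


def access(tlb, page_table, vaddr):
    """Translate vaddr, refilling the TLB from page_table on a miss and retrying."""
    try:
        return tlb.translate(vaddr)
    except TLBMiss as miss:
        # A KeyError here would correspond to a genuine fault (no mapping at all).
        tlb.map(miss.vpn, page_table[miss.vpn])
        return tlb.translate(vaddr)
```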
Fallenouso indeed as wolfspraul said it could be nice to try to use it somehow and then give a feedback on bugs which I could fix etc12:52
FallenouI should update and provide a better documentation12:53
Fallenoubecause you would have to either be in my head or read my code to understand how to use it12:53
Fallenoubut the good news is that there is already a software using the mmu as an example :) the modified bios12:54
Fallenouso it can very well be used as an example on how to use the mmu12:54
FallenouI don't know if a parallel mmu-lm32-linux work is something that someone would be interested in12:55
Fallenouthat could be a motivating booster :)12:56
mwalleFallenou: there might be some complications with the exceptions in the m stage12:56
Fallenouyes I read you about that :/12:57
Fallenouindeed if by nature exceptions in m stage are "imprecise" as stated in the lm32 datasheet ... it might be hard to implement DTLB miss and permission faults :/12:57
Fallenouso far I didn't encounter any trouble with it, but maybe it will come ...12:58
mwallewell looking at the code, i'm not really sure if the m stage or the wb data bus error causes the imprecision12:58
FallenouI guess there might be some troubles depending on if there is a jump/branch or interlock while there is an exception in m stage13:00
FallenouI really don't know yet what they have in mind when they state "imprecisions"13:00
mwalleand of course there might be an exception in the x and in the m stage at the same time13:00
Fallenouwell in that case there is a priority to set up13:01
FallenouI would say priority to the m stage exception13:01
mwalleyou have to make sure every instruction before the one causing the exception is executed and every instruction after (and the one which caused the exception) is killed13:01
mwalle= precise exceptions13:01
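A small Python model of precise exceptions in the usual textbook sense: instructions older than the faulting one complete, while the faulting one and everything younger leave no trace in the register state. The program representation (destination register plus a function) is invented for this sketch and ignores pipelining entirely.

```python
def run_precise(program, regs):
    """Execute (dest, fn) pairs in program order against the dict regs.

    Stops at the first instruction whose fn raises. Returns the index of
    that faulting instruction, or None if the whole program ran.
    """
    for index, (dest, fn) in enumerate(program):
        try:
            value = fn(regs)
        except Exception:
            # Neither this instruction nor any later one is committed.
            return index
        regs[dest] = value
    return None


if __name__ == "__main__":
    regs = {"r0": 0}
    program = [
        ("r1", lambda r: 5),
        ("r2", lambda r: r["r1"] + 1),
        ("r3", lambda r: 1 // r["r0"]),  # faults: division by zero
        ("r4", lambda r: 99),
    ]
    print(run_precise(program, regs), regs)
```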
mwallethere is already such a priority in the lm32 code and that is the reason why i don't know where the imprecise databuserror exception comes from..13:04
Fallenouand the pipeline stall/kill signaling is quite messy13:04
Fallenoureading the code really gives headaches13:04
mwallemaybe it has something to do with the shifter/divider/multiplier, which is multistaged13:04
Fallenouoh, maybe13:05
mwallelooking at the block diagram in the manual the x stage and the w stage is the only common stage for all pipes13:05
Fallenouwhat do you mean by "common stage"?13:06
mwallebtw i think we should figure this out before any serious programs are running, eg. find defects, write tests etc13:07
mwalleFallenou: eg the shift pipe has no m stage13:07
Fallenouwell it goes in the m stage anyway but does nothing in it, right ?13:07
Fallenouevery instruction passes through every pipeline stage afaik13:07
mwallebut only one is active :)13:08
mwallefor a given instruction13:08
Fallenounot sure I understand what you mean by all this13:09
mwallehave a look at figure 3 in the refman13:10
Fallenouyes I'm on it13:10
Fallenouonly load and store instructions do something in m stage afaik13:10
Fallenoubut even an add X,Y,Z would lose 1 cycle in m stage I think13:10
mwallebut a shift uses both the X and M cycle13:11
mwalleimho you could call it S1 and S2 then13:11
Fallenouoh you mean the shift uses the M cycle to compute things, but not for writing/reading main memory ?13:12
Fallenoulike an extended X stage13:12
FallenouX stage spans over M stage ?13:12
FallenouI never thought about that !13:13
mwalleif i read that block diagram correctly shift uses two cycles13:13
FallenouI thought that if an instruction would need 2 cycles to execute, it would stay in X stage for those 2 cycles and "stall" the pipeline13:13
Fallenoubut maybe then I'm wrong :)13:13
Fallenouthe diagram seems to tell I'm wrong13:14
mwallethe pipe should only be stalled for RAW hazards13:14
Fallenouwell it's stalled for cache misses as well13:14
Fallenouduring wishbone accesses13:15
mwalle(well there is some other multiply unit in the lm32 code, which might stall the pipe)13:15
Fallenoupipelined divide and multiply should stall the pipeline as well I think13:15
mwalleok and cache misses and mispredicted branches :)13:15
Fallenoumispredict branches I'm not quite sure, it just kills instruction fetched and changes the address stage13:16
Fallenouthe pipeline implementation is really not easy to read ...13:17
Fallenouusually I prefer experimenting and looking at wires and pc_a pc_f pc_d pc_x pc_m pc_w values13:17
Fallenouin the simulator13:17
mwalleFallenou: but actually all signals are named according to the stage they are used in :)13:18
Fallenouyes it's a good thing!13:18
Fallenouit helps a lot13:18
mwalleeg the adder gets input *_x and output is result_x, eg its combinatorial13:18
mwallethe shifter has *_x inputs and _m result13:18
Fallenouyes, but the difficult part is understanding the kill/stall things13:19
mwalleso the shifter uses the x and the m stage13:19
Fallenoufor a few it's understandable, but for some of these signals it's like an entire page of || and &&13:19
mwalleyeah and the ifdefs makes it even worse ;)13:19
Fallenouoh yes13:19
wpwrakifdef, the essence of job security13:20
Fallenoufor instance : stall_d13:21
mwallewpwrak: lol :b13:21
Fallenouor stall_m line 189613:21
wpwraki lost a bit track of the issue you're discussing ... is it that DTLB faults in M while ITLB faults in F ?13:22
Fallenouwell the main issue is to make sure exceptions in M stage are "accurate"13:22
Fallenouwhich the LM32 datasheet says "it is not"13:23
FallenouI don't exactly know what it means ... but we generate DTLB miss/faults at m stage so it's bad luck :(13:23
mwallewpwrak: the lm32 commit point seems to be in the x stage13:23
wpwrakah, so you're trying to determine under what conditions they wouldn't be inaccurate ?13:23
Fallenoufor now, with my examples, it seems to work, but indeed I don't do multiply/divide or complex jump/branch combinations in my samples13:24
Action: mwalle is asking himself why they haven't put it into the m stage13:24
mwalleFallenou: and of course interrupts :)13:24
wpwrakmwalle: perhaps because most things get detected at D or X ?13:25
mwallewpwrak: yeah which is bad for us ;)13:26
mwalleand maybe the right way would be to move it to the m stage.. but i fear we mess up the whole processor ;)13:26
Fallenouthe only exception we know which is firing at M stage is data bus error exception13:26
mwallebecause we dont have any test benches for it13:27
Fallenoubut is it really working correctly ? ...13:27
wpwrakwhat causes a DataBusError ? misalignment, what else ?13:27
mwalleFallenou: well and that exception is described as imprecise and lattice doesnt care about because its handled like a fatal error, eg. there should _never_ be such an exception13:28
Fallenouwpwrak: any wishbone error13:29
Fallenouwait a sec13:29
Fallenou        if ((D_ERR_I == `TRUE) && (D_CYC_O == `TRUE))13:29
Fallenou            data_bus_error_seen <= `TRUE;13:29
Fallenouin an always @(posedge clk)13:30
wpwrakso what are D_ERR_I and D_CYC_O ? :)13:30
Fallenouit's the wishbone wires13:30
mwalleFallenou: thats btw the code which could delay the exception, because this only sets the seen flag13:30
mwalle+ is13:30
FallenouI am not a wishbone expert but I think cyc_o asserted means "a transaction is active"13:31
Fallenouand err_i means the slave asserts an error line and says to the master "stoooop the transaction, i'm in a bad state !"13:31
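A Python model of just the two quoted Verilog lines, to show that the flag is registered: it is set on a clock edge where both D_ERR_I and D_CYC_O are high and then stays set. The class and method names are made up for this sketch, and whatever clears the flag in the real lm32 source is not shown in the log, so it is left out.

```python
class BusErrorLatch:
    """Models data_bus_error_seen as a flip-flop updated on each rising clock edge."""

    def __init__(self):
        self.seen = False

    def clock(self, d_err_i, d_cyc_o):
        """Apply one rising clock edge and return the flag value after it."""
        if d_err_i and d_cyc_o:
            self.seen = True
        return self.seen
```

An error signalled while no bus cycle is active is ignored, which is why both inputs are checked together.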
Fallenoumwalle: yes but then assign data_bus_error_exception = data_bus_error_seen == `TRUE;13:32
mwalleor there is no target which the address belongs13:32
Fallenouso it does not seem to be delayed13:32
wpwrakwhat i mean is basically: could the "imprecise" come from the detection being "late" ? then it may not be a problem in the case of the TLB, assuming that the TLB has a constant response time13:32
Fallenouit fires right away13:32
mwalleFallenou: there is some stall_x check iirc13:32
Fallenouthe exception fires right away, but then it is not taken right away, there are a few checks13:32
Fallenouwpwrak: oh ok maybe they mean it depends on when the slave asserts the err line ?13:33
mwallewpwrak: imprecise doesn't only mean one or more instructions too late, but also that all instruction before are killed and all later are executed13:34
wpwrakmwalle: should be easy to test if there's a problem with that13:35
mwalleand i fear we may violate that in some circumstances13:35
mwallewpwrak: yeah of course, but then you have to actually understand the pipe :)13:35
mwalleto write the test cases :)13:35
FallenouI don't know what could be my next step ... I am thinking about: doing a few simulations on a non-modified (non-mmu) lm32 core and see how exceptions work in the different stages (f/x/m)13:36
mwalleand when you understand the pipe, you could just say its plain wrong what we are doing ;)13:36
mwalleFallenou: simulate branches right before/after an exception13:36
Fallenouunderstanding the whole pipeline signaling stuff is really a nightmare13:36
mwallesimulate a break, simulate interrupts13:36
mwallesimulate shifts13:37
Fallenouyou just filled my monthly calendar :p13:37
mwalleglad to hear ;)13:37
wpwrakset register N to zero, addi rN, rN, 1; addi rN, rN, 2; addi rN, rN, 4; <code that triggers an exception>; addi rN, rN, 8; ...13:38
wpwrakthen see what you got13:38
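A sketch of how the result of this test could be decoded. Each addi adds a distinct power of two, so every bit of rN identifies one instruction. It assumes rN is sampled on entry to the exception handler, before the faulting instruction is retried; the names are hypothetical.

```python
# Bit contributed by each addi, in program order around the faulting instruction.
STEPS = {
    1: "addi 1 (before fault)",
    2: "addi 2 (before fault)",
    4: "addi 4 (before fault)",
    8: "addi 8 (after fault)",
}


def executed_steps(value):
    """List which addi instructions contributed to the register value."""
    return [name for bit, name in STEPS.items() if value & bit]


def looks_precise(value_in_handler):
    """True if exactly the three addi before the fault ran, and none after it."""
    return value_in_handler == 0b0111


if __name__ == "__main__":
    for sample in (7, 15, 3):
        print(sample, looks_precise(sample), executed_steps(sample))
```

A value of 15 would mean an instruction after the fault slipped through, and a value such as 3 would mean an instruction before the fault was killed.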
mwallewpwrak: but thats only a (basic) test13:40
mwalleand i guess Fallenou already did sth like that13:40
mwalleFallenou: another idea is to actually move the exception into the x stage13:41
mwallebut then you would have to have every signal ready in that stage13:41
mwalleand i guess the adder is used to compute sth like sw (r1+10), r213:42
mwallebut maybe we can duplicate that in the F stage...13:43
mwallei dunno13:43
Fallenou15:41 < mwalle> Fallenou: another idea is to actually move the exception into the x stage < But how? for DTLB stuff we only know that there will be a miss in M stage unfortunately :/13:47
mwalleFallenou: how do you know when there will be a miss?13:47
FallenouI know that there is a miss in parallel of the cache lookup in M stage13:48
mwallelooking at the address?13:48
Fallenoubecause I do the TLB lookup in parallel of the cache lookup13:48
mwalleFallenou: and can this be done a stage earlier? what signals would you need?13:48
mwalleeg. if you would only need the address, and the address is a result of the x stage, we could look if we can get the address already in the f stage13:49
mwalledecode stage13:50
Fallenouin theory we get everything we need in the decode stage13:51
Fallenouwe just need the virtual address being accessed, and if it's a write13:51
mwallemhhh.. but the address is r1+IMM, and we only have IMM in the decode stage..13:52
Fallenouoh right13:52
Fallenouwe need to take the displacement into account13:52
mwalleto get r1 we would have to (at least) extend bypass network..13:53
Fallenouthat seems pretty intrusive change13:53
Fallenouto touch the register file ...13:53
mwallewell, do we need this in the D or X stage?!13:54
Fallenouif we want to do the tlb lookup earlier, we need the virtual address earlier, at least at D stage, so that we can do the lookup in X stage and fire the exception earlier13:57
mwallewhich isn't feasible14:02
mwalleFallenou: btw what happens in case of an exception? all previous stages are killed, the M and W stage is allowed to be completed14:09
wpwrakyou could precalculate all possible outcomes and just pick the one that matches. 32 registers, so for each an offset corrector and a vector with 16 possible results. hmm, fpga space may get small with that14:10
mwalleand the current instructions seems to be replaced with sth like mvi ea, EBA+EID14:10
mwallewpwrak: there is also another problem, we only have all register values ready at the beginning of the X stage, so if you want to decide whether there will be an exception in the X stage, this would have be a combinatorial path14:12
wpwrakyou mean from the previous operation ? hmm yes, that would complicate things14:13
Fallenouit's much, much easier to fire the exception wire at M stage :/14:14
FallenouI suggest we test if it causes problems14:14
wpwraklet's see what could go wrong if the exception in M is handled just like in X ... completing of preceding instructions shouldn't be an issue14:16
Fallenoulet's see what goes wrong and if we can fix that14:18
wpwraksomething could crawl in to W that shouldn't go there. that would be the faulting memory operation. so there would have to be a test for that. should also be easy to write a test case of it.14:18
wpwrakthen you could get a DTLB miss together with an ITLB miss. i suppose that's something that could happen even if the DTLB exception was in X. should be safe to simply ignore the ITLB miss in this case - it'll come back when retrying after exception handling.14:19
Fallenouyes, I think we should just give more priority to DTLB exceptions14:20
Fallenouit makes no sense to serve the ITLB, if we are not sure that it will really need to be fetched anyway14:22
Fallenouif for instance the dtlb exception leads to a change in the program low (seg fault?)14:22
Fallenouprogram flow*14:22
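The priority rule proposed above, as a minimal Python sketch. Only the DTLB-over-ITLB ordering comes from the discussion; the exception names and the function are hypothetical, and the other LM32 exceptions are deliberately left out.

```python
# Highest priority first. Only this relative order is taken from the discussion.
PRIORITY = ["dtlb_miss", "itlb_miss"]


def select_exception(pending):
    """Pick the exception to take from a set of simultaneously pending ones."""
    for name in PRIORITY:
        if name in pending:
            return name
    return None
```

Dropping the ITLB miss in the combined case is safe under the assumption wpwrak states: the fetch is retried after the handler returns, so the miss shows up again if it is still relevant.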
mwalleso what you're basically trying to achieve is to move the commit point to the M stage :)14:33
Fallenouthe commit point ?14:34
mwallethe point where you know that nothing can change anymore14:34
mwallewpwrak: what happens if someone kills the M stage?14:37
wpwrakhow would someone kill the M stage ?14:38
mwallebecause theres a branch?14:38
mwalleFallenou: do you understand that kill_* and valid_* and q_* logic?14:46
mwallebranch_flushX_m is nice ;)14:47
mwalle(the comb logic)=14:47
mwallenot the signal name14:47
mwallewpwrak: Fallenou: do you know gem5?14:52
wpwrakfirst time i hear of it14:58
wpwrakisn't the M stage already too late to kill anything ?15:00
Fallenougem5? no, what is it?15:05
Fallenoumwalle: about kill_*/valid_* q_* I just understand that "kill" means the instruction in the stage will do nothing, valid it must be something similar, and q_ is to "qualify" a signal, to add an extra layer of verification15:07
Fallenouso I just have a very basic understanding15:08
Fallenouusually I just follow the lines of code to see what signal conditions what action15:08
Fallenouusually it's a mix of kill == false and valid == true and q == true15:09
wpwrakmaybe try to convert that code into a drawing. looking at it from a different angle often helps to understand things.15:09
wpwrakplus, it'll also give the rest of us some insights ;-)15:10
Fallenouhum hum ok15:19
FallenouI will try to do a drawing with the result of simulation15:19
Fallenousounds nice gem515:22
mwallewpwrak: i guess the store is done at the end of the stage, eg the x stage can still kill sth in m15:40
mwalleFallenou: unfortunately it doesnt support lm32 ;)15:41
mwalleFallenou: you use some proprietary simulation tool, right?15:41
mwallebtw maybe someone will try rtlvision pro on lm32 ;)15:42
Fallenoumwalle: I am using ISim which is free to use (there is a maximum number of lines of verilog, fortunately lm32 is small enough)15:46
Fallenouit's non open source, but it's free (they do not charge you) to use, it comes with ISE Webpack :)15:46
FallenouI have a git repository all set up for anyone who would want to simulate lm32 with ISim btw15:47
Fallenouyou just git clone, you source ISE settings.sh, you type "make" and there you go, it runs the simulation (with console for $display() and waveforms)15:48
Fallenoubut then indeed you would need to compile some code, so have the toolchain etc etc15:49
Fallenouor I could provide you with a binary file15:49
mwalleFallenou: is there a console version?16:22
GitHub74[milkymist-mmu-simulation] fallen pushed 2 new commits to master: http://git.io/7Jv5PA16:24
GitHub74[milkymist-mmu-simulation/master] fixes in ITLB - Yann Sionneau16:24
GitHub74[milkymist-mmu-simulation/master] Synchronise the simulation repository with the mmu-lm32 repository - Yann Sionneau16:24
Fallenoumaybe ISim can run in console16:24
Fallenoubut there aren't many things to click on in ISim :) Just a "start" button !16:24
Fallenouoh yes I think there is a console mode16:25
Fallenouinstead of doing ./soc -gui you just do ./soc :)16:25
Fallenouand then you type run16:25
Fallenouit seems ISim can generate .vcd file (to be viewed on gtkwave)16:55
FallenouI don't get how you can enable the console output (for $displays) while in ISim console16:55
Fallenouok you run ./soc and then do "restart" "init" and finally "run all" and it will run the simulation and print console output :)17:12
Fallenouno gui needed !17:12
Fallenouyou can add breakpoints, show value of signals, set values (overwrite them)17:22
mwallewpwrak: btw http://pastebin.com/eNVi9z9n works like a charm :)17:24
mwallemh i need to save ba, too17:24
mwallecan i unset a debugger variable?17:30
Fallenouare those debugger commands that work using the milkymist jtag cable ?17:31
mwalleFallenou: why do you import the sources for the processor in the simulation repo? can't they be outside?17:32
mwalleFallenou: yep :)17:32
Fallenouwell yes they could ^^17:33
FallenouI should do a git submodule17:33
Fallenouit's not the best organization ever, I agree17:34
Fallenouthe best would be to specify your lm32 folder in the simulation repository to be imported17:35
mwallewith the binary which should be executed :)17:35
mwallesoc.v emulates the WB?17:37
Fallenouyes, and the uart, and it uses sram as a wishbone slave to hold code17:37
Fallenouit's just a "dummy" toplevel, instantiating lm32/wishbone/sram/fake uart based on $displays17:38
mwallei c17:40
Fallenou19:36 < mwalle> with the binary which should be executed :) < I modify the binary a lot while simulating, so I often recompile the bios and then copy it to simulation folder and relaunch the simulation17:41
mwalleiverilog didnt work, right?17:41
FallenouI remember having tried a few simulators and the best solution I found so far was ISim17:42
Fallenouiverilog was my first try, I don't remember why I stopped using it17:43
FallenouI can only remember an issue with log2 function17:43
Fallenoudo you want a bios.bin with an itlb test ?17:50
Fallenou(or directly a ram.data)17:51
mwalleFallenou: i'm planning to begin with some basic instructions18:01
mwalleusing the original lm32 core18:01
Fallenouok :)18:05
Fallenouyou can compile a simple .S file, then take the binary, name it "bios.bin" and move it in the simulation repo, use the h2g tool to generate ram.data (or just do make ram.data)18:06
Fallenouh2a tool*18:06
mwallebtw agreen (atgreen?) solved the verilog readmemh quite nicely: http://moxielogic.org/blog/?p=18018:16
mwallelm32-elf-objcopy -O verilog bios.elf bios.vh :)18:17
Fallenouwow very nice :)18:17
Fallenouthat would remove the need for h2a18:18
mwallebut you need byte memory18:18
Fallenouit would need to be parametrizable18:20
Fallenouand just generate the ascii file instead of verilog code18:20
Fallenouoh, I misunderstood, it generates only ascii file18:23
Fallenoubut I mean we should be able to set the number of bytes per line18:24
Fallenouit seems in the example we can see on his blog that it's one byte per line18:24
mwallejust use four bytes for one word18:27
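A minimal sketch of the "four bytes for one word" idea, assuming the objcopy output is plain text made of whitespace-separated two-digit hex bytes plus optional "@address" records, and assuming big-endian packing (first byte is the most significant one). The function name `bytes_to_words` is hypothetical, and the sketch ignores address records, so it only suits a single contiguous image starting at address 0.

```python
def bytes_to_words(vh_text, bytes_per_word=4):
    """Pack hex bytes into hex words, one string per memory word.

    Assumptions (not taken from the log): tokens starting with '@' are
    address records and are skipped; bytes are packed big-endian; a
    trailing partial word is padded with zero bytes.
    """
    data = []
    for token in vh_text.split():
        if token.startswith("@"):
            # Address record: ignored in this sketch.
            continue
        data.append(int(token, 16))

    # Pad so that the image is a whole number of words.
    while len(data) % bytes_per_word:
        data.append(0)

    words = []
    for i in range(0, len(data), bytes_per_word):
        chunk = data[i:i + bytes_per_word]
        words.append("".join(f"{b:02x}" for b in chunk))
    return words
```

Writing "\n".join(bytes_to_words(text)) to ram.data would then give one word per line, and bytes_per_word is the parameter asked for above.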
wpwrakmwalle: load usb firmware ? now you lost me18:27
mwallewpwrak: last time i asked for some helper functions in gdb :)18:27
mwallethe task was to load an usb firmware binary through gdb18:27
mwalleand the helper program needed to convert from binary to 'usb 16bit memory accessed through 32bit writes'18:28
mwalleif (frag_partial_adr == 32'h11000C00) << is this the uart base address the bios uses?18:29
wpwrakaah, i see. so that's what you wanted it for :)18:30
mwallewpwrak: so i can develop a new firmware on my pc and upload it automatically18:30
wpwrakcute :)18:34
Fallenoumwalle: yes18:34
Fallenoudivided by 418:35
Fallenouthe real address is 0x44003000 iirc18:36
Fallenoubbl eating :)18:42
Fallenoumwalle: in my linker file I am using a MEMORY {} containing only 1 element : sram, with ORIGIN = 0x4400000018:44
Fallenouto have the same kind of addresses Milkymist One board uses for SDRAM18:45
Action: Fallenou is back20:03
FallenouI configured the lm32 cpu to boot at address 0x4400000020:07
Action: Fallenou is trying to remember why he does a logical AND with 32'h2EFFFFFF to address sram wishbone slave20:26
Fallenouat least the result is that 0x2E and 0x11 == 0x0020:35
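The address arithmetic above can be checked with a short sketch. It assumes that frag_partial_adr is a 32-bit word address (byte address divided by 4, as stated above) and that the 32'h2EFFFFFF mask is applied to that word address; the helper names are hypothetical.

```python
UART_WORD_ADDR = 0x11000C00   # value compared against frag_partial_adr
SRAM_MASK = 0x2EFFFFFF        # mask used when addressing the sram slave
BOOT_BYTE_ADDR = 0x44000000   # boot address / ORIGIN in the linker file


def word_to_byte_addr(word_addr):
    """Convert a 32-bit word address to a byte address (multiply by 4)."""
    return (word_addr << 2) & 0xFFFFFFFF


def byte_to_word_addr(byte_addr):
    """Convert a byte address to a 32-bit word address (divide by 4)."""
    return byte_addr >> 2


def sram_offset(word_addr):
    """Apply the mask; assumed to act on the word address."""
    return word_addr & SRAM_MASK


print(hex(word_to_byte_addr(UART_WORD_ADDR)))               # uart byte address
print(hex(byte_to_word_addr(BOOT_BYTE_ADDR)))               # boot word address
print(hex(sram_offset(byte_to_word_addr(BOOT_BYTE_ADDR))))  # offset inside sram
```

Under these assumptions the mask clears the 0x11 in the top byte, so code linked at 0x44000000 lands at offset 0 of the sram model.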
mwalleFallenou: just pushed lm32-simulation in my milkymist repository22:01
mwalleit adds cores/lm32/test22:02
mwalleit does basically the same as your lm32 simulation repo22:03
mwalleunfortunately it doesnt work (yet) ;)22:03
wpwrakmwalle: regarding killing M ... wouldn't incorrect branch prediction be detected in X, and thus affect F and D, but not the later stages ?22:03
mwalleah and it uses iverilog22:03
FallenouI used migen to generate soc.v22:03
mwallewpwrak: if a branch is predicted wrong, there are the wrong instructions in the pipe, which have to be killed22:04
mwalle'later' in the pipe22:04
Fallenouyou wrote the testbench yourself ?22:04
mwalleFallenou: yep22:04
Fallenounice :)22:04
mwallewell with some input from yours ;)22:05
mwalleor the generated file22:05
mwallebut frankly i dont like the long signal names22:05
wpwrakmwalle: yes, but they ought to be "after" the branch, not "before". so are you sure about incorrect branch prediction affecting M ?22:05
Fallenouit's the problem of using code generators :)22:06
Fallenouyou end up with strange names22:06
mwallewpwrak: and after means?22:06
Fallenoubut I am very glad I didn't have to write that piece of code (soc.v)22:06
mwalleFallenou: how does isim handle the non initialized ram (used in cache?)22:07
Fallenouram is initialized22:07
Fallenouisim handles non initialized things quite badly22:07
mwalleor where is it initialized?22:08
Fallenounon initialized value is "X" which behaves very badly in subsequent logic operations22:08
wpwrakassuming that the prediction error is found in X, that would be F and D (i.e., no harm done, since they don't have side-effects)22:08
wpwrakso you'd get only a delay22:08
wpwrak(side-effects) well, besides a potential ITLB miss. but that's a separate issue :)22:08
Fallenoulook at line 13022:08
Fallenouhttps://github.com/fallen/milkymist-mmu-simulation/blob/master/lm32_dp_ram.v and line 2922:09
Fallenouand at the bottom of lm32_cpu.v I replaced reg0.mem by reg0.ram (idem for reg1): https://github.com/mwalle/milkymist/blob/lm32-simulation/cores/lm32/rtl/lm32_cpu.v22:18
mwallewpwrak: yeah the first kill is likely the decode stage22:19
mwallebut it'll move forward in the pipe22:19
mwalleFallenou: ah you actually modified the sources, didnt see that22:19
wpwrakah, you mean a "zombie" walking through the pipeline. okay, but that one won't cause us grief - especially no DTLB miss.22:20
Fallenoumwalle: yes I took that liberty :p22:22
mwallewpwrak: maybe, what i tried to say is, we should now put some thoughts into the exception handling, figure out what can happen, why lattice chose the x stage for exceptions, what happens if the exception is raised in the m stage, etc22:23
mwallebut for that we really have to understand the pipe22:23
mwalleso the first question for me is, how an exception actually works22:25
mwallesomehow the PC ends up in EA, and a jump is done22:26
Fallenouto get that, the best I think is to simulate lm32 and look at the wires22:26
mwallei guess a pipeline visualizer would be handy too ;)22:26
Fallenoubasically you assert exception_x, and when exception_m == TRUE you can be sure that the exception will actually happen and you can deassert exception_x22:27
wpwrakmwalle: yeah, a visualization of the pipeline would be good. that's what i suggested to Fallenou.22:28
mwalleFallenou: what instruction causes the writeback to EA?22:28
mwallethe one causing the exception?22:28
FallenouI don't know22:28
Fallenoumaybe the one in X stage, dunno22:29
Fallenou(the one which was in X stage I mean)22:29
Fallenoubut I really don't know22:29
wpwrakFallenou: not sure if signals are the right level of abstraction, though. i'd think more of a list of conditions at each stage, followed by either the content of the next stage(s) or arrows indicating an immediate effect22:29
mwallewpwrak: i was never good at painting ;)22:29
Fallenouneither was I :p22:30
mwallewpwrak: but you may extract the needed information from the signals22:30
wpwrake.g., D = shift would be followed by a two stages long block for X and M. F = ITLB miss would be followed by a flush and then exception, etc.22:30
wpwraksure, the signals have it all. but i think they're still too low-level. they basically tell you how it's done, but not why.22:31
Fallenouand they are too numerous to have all the conditions in head22:31
wpwraki'm basically looking for the "pipeline designer's view"22:31
Fallenouat least I cannot get everything in my head22:31
wpwrakthat's why i'm suggesting a drawing :)22:31
Fallenouyes it's a good idea but it may take a while to implement such a visualizer22:32
wpwrakif it gets too complex, draw it. that forces you to make things consistent. each time you find something that doesn't fit or that you can't explain, you've discovered a hole in your knowledge you may not have been aware of before.22:32
wpwrakno no, not a visualizer. just a diagram :)22:33
Fallenouwell to be honest I drew a few complete sheets of pipelines :p22:33
mwallehave a look at http://www.cs.washington.edu/education/courses/cse378/07au/lectures/L11-Pipelined-Datapath-And.pdf22:33
wpwrakalright, multiple diagrams then ;-)22:33
Fallenouyes I basically did that mwalle22:33
Fallenoubut with a longer program22:34
Fallenoulike 20/25 lines like that22:34
mwalleso if we get sth like that _automatically_ from a simulation output22:34
Fallenouthat would be cool :)22:34
wpwrakmwalle: that's a good start. now add the things connecting the little boxes, and the different possible outcomes :)22:35
Fallenouto add a bit of code to lm32 to just generate data about pipeline, and then process the data with a graphic tool22:35
wpwrakthe simulation output isn't what defines how things work. the source is.22:35
mwallewpwrak: like page 12?22:36
wpwrakoh, there are multiple pages ;-) checking ...22:36
mwallewpwrak: well its a first step to understand the pipeline22:37
wpwrakpage 12 is basically figure 3 from the lm32 archman22:37
mwallebecause you can easily write some small program and visualize that22:37
mwallewell its a classical 5 stage pipe ;)22:38
wpwrakpages 21-29 are better, but probably have too much detail22:38
mwallewpwrak: all pages are missing the control logic22:39
wpwrakyup. i'm more thinking of something like page 1 but with control logic and different choices.22:39
wpwrakso, not visualization of a single run but the design for all runs. or, if it gets too complex, one drawing per major case.22:41
FallenouI think the first step would be to visualize which instruction is in which stage, maybe with a few important wire values. Just as a first step22:41
Fallenouand then maybe show a lot more condition value22:41
mwallehave to go to bed now ;)22:41
Fallenouat least showing all kill/valid/q values for each stage22:41
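A minimal sketch of that first step (which instruction sits in which stage on each cycle), assuming an ideal pipeline with no stalls, kills or exceptions and the stage names F, D, X, M, W used in this discussion. It does not model the real lm32 control logic; the function names and the three-instruction program are made up for illustration.

```python
STAGES = ["F", "D", "X", "M", "W"]


def occupancy(program, stages=STAGES):
    """Return one row per cycle listing the instruction in each stage.

    Assumes one instruction enters the first stage per cycle and nothing
    ever stalls or gets killed; '-' marks an empty stage.
    """
    rows = []
    total_cycles = len(program) + len(stages) - 1 if program else 0
    for cycle in range(total_cycles):
        row = []
        for depth in range(len(stages)):
            index = cycle - depth  # instruction that entered 'depth' cycles ago
            row.append(program[index] if 0 <= index < len(program) else "-")
        rows.append(row)
    return rows


def render(program, stages=STAGES):
    """Format the occupancy table as plain text, one line per cycle."""
    width = max(len(name) for name in list(program) + list(stages))
    lines = ["cycle " + " ".join(s.ljust(width) for s in stages)]
    for cycle, row in enumerate(occupancy(program, stages)):
        lines.append(f"{cycle:5d} " + " ".join(c.ljust(width) for c in row))
    return "\n".join(lines)


print(render(["lw", "add", "sw"]))
```

Extra columns for the kill/valid/q values of each stage could be appended to every row once they are extracted from the simulation output.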
Fallenouthe same for me!22:42
Fallenouvery nice to have this brainstorming with you guys :)22:42
Fallenouthanks !22:42
Fallenousee you!22:42
wpwrakpipelined dreams ! :)22:43
Fallenouwpwrak: so not a visualization of a particular run with a particular program running ?22:43
Fallenou(pipelined dream) ahah :)22:43
Fallenouso a static image ? not an animation ?22:43
wpwrakyeah, static. something you can pin on the wall and look at when you get lost.22:44
wpwrakif you don't trust your code analysis you can still pick things from the simulation. but you probably don't need a visualizer just for checking.22:45
FallenouI have trouble visualizing in my head the page 1 with control logic22:46
Fallenouand even if there is no program running :o22:46
Fallenoueven more*22:47
Fallenouhow can you do a drawing like page 1 but in a "generic" way, without any actual assembly instructions22:47
wpwrak(i'm thinking here of my experience with the linux (network) traffic control infrastructure. that one seemed impenetrably complex. i spent something like two months trying to figure it out. eventually, i succeeded to the point of being able to write non-trivial code for it. then i wrote a paper, which helped me to find more inconsistencies in my understanding. a few weeks later, suddenly extensions from other people started to pop up ;-)22:47
FallenouI guess network stack of linux is a bit of a mess like a cpu pipeline :)22:48
FallenouI really agree that a drawing would help a lot not being lost22:48
FallenouI just have trouble picturing what you have in your head22:49
wpwrakyou could have multiple choices per stage. e.g., "memory read", "memory write", "shift", "multiply", "anything else". so just stack them vertically. you basically get a tree.22:49
Fallenouby control logic, you mean kill_*/valid_*/q_* ?22:49
wpwrak(linux networking) here's the paper: www.almesberger.net/cv/papers/tcio8.pdf22:50
wpwrak(logic) yes, at each stage, you could indicate the active signals that go out to other stages.22:51
Fallenouthat would be hell of a wall poster22:52
Fallenoua tree of pipeline lines22:52
Fallenouwouldn't it be too big?22:53
wpwrakmaybe ;-) if it gets too complex, break it down into scenarios. e.g., "normal case", "normal shift", "shift with itlb miss", and so on22:54
Fallenouok I see22:54
Fallenoudo you have a technology in mind for such drawings?22:57
wpwrakyou mean a drawing program ? well, i use xfig for almost everything, but it's not to everyone's taste.22:59
wpwraka more modern choice would be inkscape23:00
FallenouI used dia but it's not very easy to use23:00
FallenouI used another tool to generate DAG or trees with text files , I don't remember the name of this command line tool23:00
Fallenouhttp://en.wikipedia.org/wiki/DOT_language < this one23:01
wpwrakGraphviz ?23:03
wpwrakyeah, the classic :)23:03
Fallenoudunno if it's good for this task though23:03
wpwrakmaybe. i'd just do the drawing manually. gives me more control :)23:04
wpwraksee also http://en.wikipedia.org/wiki/DOT_language#Limitations :)23:05
Fallenouyes I remember having this kind of problems23:05
FallenouI wanted a specific layout with subgraphs ...23:05
FallenouI ended up using invisible arrows to have the layout I wanted ...23:06
wpwrakmaybe consider inkscape. you may need it anyway for "polishing" things23:07
wpwraki remember that, a long time ago, tgif looked promising as well. it still seems to be around. may be a bit more compatible with modern taste than xfig.23:08
wpwrakwhat's nice about xfig is that the text format is fairly simple. so you can write nice post-processors. e.g., for drawings with conditional elements.23:09
wpwraknot the xml madness that's all the rage nowadays23:10
Fallenouah good!23:11
Fallenoutime for me to go have pipelined dreams as well!23:11
wpwrakstall-free pipelining then ! :)23:12
Fallenouhéhé thanks, interruption-free dreams!23:12
--- Mon Oct 22 201200:00
