#milkymist IRC log for Wednesday, 2013-02-27

larscmore xilinx ranting: They made the XADC threshold interrupts level sensitive, as if I'm able to bring the temperature down again from within the interrupt handler.16:11
qwebirc72236hiiiii16:26
qwebirc72236@name vdmo16:29
larsc /name16:29
qwebirc72236m3 status?16:30
lekernelqwebirc72236, are you the one who just sent me an email?16:39
GitHub110[migen] sbourdeauducq pushed 1 new commit to master: http://git.io/ia8vZQ17:20
GitHub110migen/master c10622f Sebastien Bourdeauducq: fhdl/verilog: insert reset before listing signals17:20
lekernelazonenberg, your cmakefile still generates a makefile when gtkmm is not found22:23
lekernel(and then the build fails)22:23
lekernelfor red tin22:23
azonenberglekernel: Hmm, i'll fix that22:23
azonenbergThe stuff in the google code repo is actually somewhat out of date22:24
azonenbergversion 0.2 is in the repo with my thesis22:24
azonenbergit uses a lot of the same build infrastructure22:24
azonenbergand i havent had the time to split it off and push back22:24
lekernelwhere can I find that new repo?22:25
azonenbergIt's not public at the moment because my VPS provider is oversubscribing me and i dont even have enough capacity for myself22:25
azonenbergi'm in the process of changing hosts22:25
lekernelProject and inventory management system written in PHP [END OF LIFE] << lol22:26
lekerneljust use github22:26
azonenbergi'm still SVN based22:26
lekernelwhy write "module(some_signal); input wire some_signal;" instead of "module(input some_signal);" ?22:33
lekernelbtw I think you can infer SRLC32E and get rid of xilinx-specific code22:33
azonenbergI need detailed control over the primitives in order to do the partial reconfig properly22:33
azonenbergi have to make sure it's exactly one level of logic22:33
azonenberganyway the JTAG stuff (not in that code version) is inherently xilinx specific22:34
lekernelyou can describe the reconfiguration behaviour22:35
lekerneland Xst should pack it into a SRLC32E correctly22:35
azonenbergI've had bad experiences with inference in such cases22:35
azonenbergIn order to get maximum speed i'm pretty sure that device-specific optimizations are necessary22:35
azonenbergi plan to make an altera port at some point too22:35
azonenbergwith the same capabilities22:35
azonenbergand probably re-using a lot of the logic22:36
lekernelif you use the correct code patterns, you should get *exactly* the same SRLC32E22:36
azonenbergi've been able to infer fixed-address shift registers22:36
lekernelplus, it's easier to simulate, and it will work to some extent on non-xilinx fpgas22:36
azonenbergbut not variable-address ones22:36
lekernelaccording to the xilinx libraries guide, the variable address one can be inferred to22:37
lekerneltoo22:37
azonenbergi'll look, i just know that i've had difficulty convincing the tools to do it22:37
azonenbergbear in mind that code that infers properly for xilinx may not infer properly on altera22:37
azonenbergand thus still need vendor-specific tuning22:37
lekernelsee http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_3/xst_v6s6.pdf p. 17222:37
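For readers following the inference argument, here is a rough behavioral model of a variable-address shift register, the kind of description lekernel suggests the synthesizer can map onto an SRLC32E. It is written in Python for illustration only, not as synthesizable HDL. The class and method names are invented for this sketch and do not come from red tin or from any Xilinx library.

```python
class AddressableShiftRegister:
    """Behavioral sketch of a shift register with a run-time selectable tap.

    Hypothetical model for illustration: data shifts in at position 0,
    an address selects which stored bit is visible on the output, and
    the last bit is exposed separately for cascading.
    """

    def __init__(self, depth=32):
        self.bits = [0] * depth

    def clock(self, d, ce=True):
        # On an enabled clock edge, shift every bit one position deeper.
        if ce:
            self.bits = [d] + self.bits[:-1]

    def q(self, address):
        # Variable-address read: the tap is chosen at run time.
        return self.bits[address]

    def q_cascade(self):
        # Deepest bit, which would feed the next register in a chain.
        return self.bits[-1]


srl = AddressableShiftRegister()
for bit in (1, 0, 1):
    srl.clock(bit)
taps = [srl.q(0), srl.q(1), srl.q(2)]
```

Shifting a new bit pattern in is what "reconfiguration" amounts to in this model; a fixed-address shift register is the special case where `address` never changes.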
azonenbergAlso, i may not keep that exact SRL structure22:38
azonenbergi'm in the process of experimenting with different options to allow reconfigurability while packing as many trigger controls as possible into one slice22:38
azonenbergi'm looking at options including SRLs, dual-port RAM, etc22:38
azonenbergOh, and i still have to do the run-length encoding22:41
azonenbergthe trigger logic is going to by necessity be quite specialized to the underlying FPGA slice primitives22:41
lekernelby necessity?22:42
azonenbergIf i want to be able to run the LAs as fast as whatever input i throw at it22:42
azonenbergthe LA has to be extremely efficient22:42
lekernelthings like RLE rather sound like higher-level algorithms that would be better implemented with behavioral code22:42
azonenbergThe trick is to switch the trigger block to edge detection (for RLE)22:43
lekernelwell, when done right, synthesized netlists are extremely efficient22:43
lekernelportable22:43
lekerneland faster to write22:43
azonenbergMy current thinking is to squeeze that into the last unused SRL32 primitive22:43
azonenbergthe last input*22:43
lekernelwhat will you gain with that? 0.1ns of timing? 3 LUTs?22:44
azonenbergare you kidding me? more22:44
azonenbergeven in -3 series s6 it's 210ps through one LUT22:45
azonenbergthen routing delays22:45
azonenbergit'll probably be at least 500ps22:45
azonenbergwhich at 200 MHz is a 10% speedup for each LUT I can cut out of the critical path22:45
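The 10% figure follows from simple arithmetic, sketched here. The 500 ps per LUT level (logic plus routing) is azonenberg's own estimate from the lines above; the function name is made up for this example.

```python
def lut_share_of_period(clock_hz, lut_delay_s):
    """Fraction of one clock period consumed by a single LUT level."""
    period_s = 1.0 / clock_hz
    return lut_delay_s / period_s


# 200 MHz gives a 5 ns period; 500 ps is the estimated cost of one
# LUT level including routing (estimate taken from the chat).
share = lut_share_of_period(200e6, 500e-12)
print(f"{share:.0%} of the clock period per LUT level")
```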
lekernelS6 is a slowness pig, just use kintex ...22:46
azonenbergMy current platform is s6 based22:46
azonenbergI have an artix7 bord in the works22:46
azonenbergboard*22:46
lekernelso you're planning to optimize each bit of your design like this, by hand?22:47
azonenbergNo, just the LA22:47
lekernelbut the rest won't catch up22:47
azonenbergthe goal is for the LA to be fast enough to keep up with anything i can conceivably throw at it22:47
azonenbergartix7 in -1 speed is 130ps per LUT, lol22:49
azonenbergDouble the LUT performance right there22:49
azonenbergkintex7 is 60ps, wow22:49
lekernelyeah I told you22:49
lekernels6 is slow crap22:49
lekernelazonenberg, so after the trigger condition is met, you get 512 samples at the clock frequency?23:03
azonenberglekernel: Right now, yes23:03
azonenbergThe RLE version will give you 512 edges23:03
azonenbergas in, sample all signals whenever any one changes23:03
azonenbergWhich will let you capture with long delays between bus cycles etc23:04
azonenbergor slow signals like a UART as long as there isnt fast stuff in the same capture23:04
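A minimal software sketch of the "sample all signals whenever any one changes" idea described above. The chat does not say how the real core stores its entries, so the (sample index, value) pair format is an assumption, as are the function name and the default depth of 512 (borrowed from the current buffer size).

```python
def capture_on_change(samples, depth=512):
    """Store (sample_index, value) only when the sampled word changes.

    Assumption: each stored entry carries its own sample index so the
    idle time between edges can be reconstructed afterwards.
    """
    captured = []
    previous = None
    for index, value in enumerate(samples):
        if value != previous:
            captured.append((index, value))
            previous = value
            if len(captured) == depth:
                break  # buffer full
    return captured


# A slow trace: long idle stretches between a handful of changes.
trace = [0b01] * 1000 + [0b11] * 3 + [0b10] * 5000 + [0b01] * 2
edges = capture_on_change(trace)
```

Here 6005 raw samples collapse into four stored entries, which is why a 512-entry buffer can span long gaps between bus cycles. A single fast-toggling signal in the same capture would fill the buffer quickly, matching the caveat about fast stuff in the same capture.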
azonenbergIn the future i'll be making it parameterizable depth23:04
azonenbergand possibly with23:04
azonenbergwidth*23:04
azonenbergthese are all in progress for v0.223:04
azonenbergThe code in that repo hasnt been touched in months23:05
lekernelparametrizable depth should be a < 10 line change, no?23:05
azonenbergIt's not that simple because the new version is using my NoC protocol for reading data out over JTAG23:05
azonenbergand packets have a 512-word limit23:05
azonenbergso i'll need to add code to send multiple packets23:05
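The multi-packet readout mentioned above comes down to chunking the capture buffer. A small sketch, assuming only the 512-word packet limit stated in the chat; the function name is hypothetical and nothing here reflects the actual NoC packet format.

```python
def split_into_packets(words, max_words=512):
    """Split a capture buffer into consecutive packets of at most max_words."""
    return [words[i:i + max_words] for i in range(0, len(words), max_words)]


# A deeper capture than one packet can hold.
capture = list(range(1300))
packets = split_into_packets(capture)
packet_lengths = [len(p) for p in packets]
```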
azonenbergAdding more depth to the actual capture core will be trivial23:07
azonenbergthe wrappers, less so23:07
azonenbergThe UART code in that version is buggy too23:08
azonenbergi need to rewrite it from scratch23:09
larscif the rate of change is low enough do you think continuous streaming can be implemented?23:09
azonenberglarsc: High on my list of things to explore23:09
larsccool23:09
lekerneland I suppose reusing the milkymist uart is not an option, right?23:09
azonenbergfirst thing to do is improve my JTAG code so i can get more than 80 Kbps23:09
azonenberglekernel: the UART itself is fine23:10
azonenbergit's the code that talks to it23:10
azonenbergAnd if its wishbone based then no, not an option23:10
azonenbergas i'm not using wishbone23:10
azonenbergBut my uart is being used in a lot of code and works fine23:11
azonenbergby several people23:11
azonenbergthe binary protocol on the wire is horrible though23:11
lekernelthe new one is any bus standard you feed into the migen csr generator. but I guess you won't like migen either :)23:11
azonenbergpythin :p23:11
azonenbergpython*23:11
azonenbergAnd that's the thing, it's not a bus23:12
azonenbergit's a network23:12
azonenbergthe topology isn't flat23:12
wpwrakheh, 512 samples. and i thought i was doing poorly with my ben-based "LA" and its ~8000 samples :) (of course it's a little slower than the FPGA)23:12
azonenbergwpwrak: this is able to capture signals with data rates of over 200 MHz23:13
lekernelah, it talks to the UART through the NoC?23:13
azonenberglekernel: The old version did not23:13
azonenbergthe new version is a NoC slave23:13
azonenbergyou can have a softcore download trigger conditions and pull data off23:13
azonenbergor, as i'm doing now, use the NoC-to-JTAG debug bridge to control it from a PC23:13
azonenbergOnce i work out a bug in the debug bridge i'll be all set23:15
wpwrakazonenberg: oh, i thought it would be faster. the ben goes up to 56 MHz, maybe even 84 MHz on a good day23:16
azonenbergwpwrak: This is the highest i've tested it at23:16
wpwrak(pure capture. no decoding or display)23:16
azonenbergi'm mostly limited by spartan6 block RAM23:16
azonenbergwhen i move to 7 series i'll be able to go a lot higher23:16
wpwrakseems that lekernel is right about s6 being sluggish :)23:16
azonenbergs6 block ram maxes out at like 300ish MHz23:17
azonenbergi forget the exact number, 320ish?23:17
azonenbergI could potentially go faster by DDRing the RAM bank23:17
azonenberghave two ram blocks with 180 degree phase-shifted clocks23:17
wpwrakthat sounds more like it. plus, you'd double the depth.23:18
azonenbergAnd also double the size of the core23:18
azonenbergWhich is a real concern if you want it to be small23:19
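The "DDR the RAM bank" idea can be pictured as interleaving: even-numbered samples land in one RAM block and odd-numbered samples in the other, each block running at half the sample rate. The sketch below models only the data ordering, not the clocking. The function names are invented, and the assignment of even samples to the 0-degree bank is an assumption.

```python
def write_interleaved(samples):
    """Distribute samples across two banks clocked 180 degrees apart."""
    bank_a = samples[0::2]  # assumed: captured on the 0-degree clock
    bank_b = samples[1::2]  # assumed: captured on the 180-degree clock
    return bank_a, bank_b


def read_interleaved(bank_a, bank_b):
    """Merge the two banks back into the original sample order."""
    merged = []
    for i in range(len(bank_a)):
        merged.append(bank_a[i])
        if i < len(bank_b):
            merged.append(bank_b[i])
    return merged


samples = list(range(9))
bank_a, bank_b = write_interleaved(samples)
restored = read_interleaved(bank_a, bank_b)
```

Two banks of the same depth hold twice as many samples in total, which is the doubled depth wpwrak points out, at the cost of the doubled RAM footprint.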
larsclinear growth is ok23:21
larsc;)23:21
azonenbergBut now the minimum size of the core in the smallest provisioned size is bigger23:21
azonenberg512 is a little shallow though so i plan to fix that23:22
azonenbergfirst step is to get rid of this jtag race condition23:22
wpwrak(plan to fix that) sometimes a design just really really wants to go in a certain direction :)23:23
azonenberglol well i prioritize bug fixing over new features23:23
wpwrakhow unprofessional ;-)23:24
larscsometimes you could get the impression that you are quite alone with that in the software world23:24
larscbug fixes don't sell23:24
larscfeatures do23:24
azonenberglol23:24
azonenbergbut i'm not trying to sell23:24
azonenbergi want to solve my own problem :p23:25
azonenbergand that means looking sexy isn't the first priority23:25
azonenbergworking is :p23:25
larscI bet we could convince wpwrak to become your product manager ;)23:25
wpwrakheh ;-)23:28
--- Thu Feb 28 201300:00
