| Nick | Message | Time |
|---|---|---|
| GitHub187 | [scripts] xiangfu pushed 2 new commits to master: http://git.io/vmKSDw | 06:19 |
| GitHub187 | [scripts/master] reflash_m1.sh snapshot: don't flash data by default - Xiangfu Liu | 06:19 |
| GitHub187 | [scripts/master] update the power-on message - Xiangfu Liu | 06:19 |
| qi-bot | The Firmware build was successful, see images here: http://fidelio.qi-hardware.com/~xiangfu/build-milkymist/milkymist-firmware-11292011-0735/ | 08:19 |
| GitHub134 | [flickernoise] sbourdeauducq pushed 1 new commit to master: http://git.io/JI_FKw | 09:11 |
| GitHub134 | [flickernoise/master] performance: fix unmapped key handling - Sebastien Bourdeauducq | 09:11 |
| azonenberg | lekernel_: Any experience working with multiprocessor softcores? | 09:37 |
| azonenberg | I'm particularly interested in the interconnect | 09:38 |
| azonenberg | in terms of cache coherency and how multiple processors share the bus | 09:39 |
| azonenberg | i'm working on a triple-core SoC from scratch | 09:39 |
| azonenberg | and am designing the interconnect fabric now | 09:39 |
| azonenberg | i'm using a shared bus (only one core can talk at a time, but it's full duplex) | 09:39 |
| lekernel_ | what's at the other end of the shared bus? DRAM? | 09:40 |
| lekernel_ | also, softcores are slow. why not use dedicated accelerators? | 09:40 |
| azonenberg | A fixed-mapping MMU | 09:40 |
| azonenberg | That splits the address bus between memory mapped IO and DDR2 | 09:40 |
| azonenberg | the DDR2 has an L2 cache in front of it | 09:40 |
| lekernel_ | MMU? you mean address decoder? | 09:40 |
| azonenberg | each core has its own dedicated L1 | 09:40 |
| azonenberg | Basically, yes | 09:41 |
| azonenberg | hardwired mapping | 09:41 |
| azonenberg | The L1 is going to be structured in such a way as to be a passthrough for the IO address range and cache DRAM and flash addresses | 09:41 |
| azonenberg | then DRAM and flash will have their own SoC-wide L2 caches | 09:41 |
| azonenberg | I know i'm reinventing the wheel a bit, its mostly an educational exercise | 09:42 |
| Action | azonenberg is writing a dissertation on computer architecture soon and wants to sharpen his skills first | 09:42 |
| azonenberg | But its actually going to be quite fast | 09:42 |
| azonenberg | on spartan6 -2 speed i am shooting for 200 MHz | 09:42 |
| azonenberg | * 2-way superscalar | 09:43 |
| azonenberg | = 800 mflops for 2 cores | 09:43 |
| azonenberg | I had to pipeline the heck out of it, but its looking feasible | 09:44 |
| lekernel_ | until you get timing paths into the bus arbiter? :) | 09:44 |
| azonenberg | Actually, the bus arbiter is looking just fine | 09:46 |
| azonenberg | i just did a standalone test of it at 200 mhz and it works just fine | 09:46 |
| azonenberg | on hardware | 09:46 |
| azonenberg | My solution to this thing is, pipeline it like crazy | 09:47 |
| azonenberg | its a barrel processor | 09:47 |
| lekernel_ | with all the cores and memory controller connected to it? | 09:47 |
| azonenberg | so a 16 stage pipeline means zero latency | 09:47 |
| azonenberg | and 32 stages means one stall | 09:47 |
| azonenberg | i run 16 threads and context switch every clock | 09:47 |
| lekernel_ | ah, i see | 09:47 |
| azonenberg | Right now its looking like when running out of L1 cache with a 16 stage pipeline i will have no stalls | 09:47 |
| azonenberg | despite not having any forwarding whatsoever | 09:47 |
| azonenberg | an L1 cache miss that hits in L2 will most likely stall one instruction | 09:47 |
| azonenberg | if i can fit the L1=>L2 and back in 16 clocks | 09:48 |
| azonenberg | or 2 instructions if it takes me 32 | 09:48 |
| azonenberg | as long as i can keep the entire bus structure pipelined | 09:48 |
| azonenberg | this is a very GPU-esque architecture | 09:48 |
| azonenberg | hiding latency by multithreading | 09:48 |
| lekernel_ | what about cache miss rates when you have 16 threads switching so fast? | 09:48 |
| azonenberg | I envision it being something like CUDA, each thread executing mostly the same instructions | 09:48 |
| azonenberg | But they can branch as they see fit | 09:48 |
| azonenberg | The entire architecture is mostly an experiment | 09:49 |
| lekernel_ | you should compile dedicated hardware accelerators ... | 09:49 |
| azonenberg | you mean, ASIC level? | 09:49 |
| lekernel_ | adding layers over layers makes things slow | 09:49 |
| azonenberg | Sure, go get me $30K and i'll get it fabbed in MOSIS :p | 09:49 |
| lekernel_ | yes, generate VHDL from CUDA directly | 09:49 |
| azonenberg | and no, this is mostly an educational exercise | 09:49 |
| lekernel_ | no, I mean use the FPGA fabric directly | 09:49 |
| azonenberg | The goal is to see how many flops i can pull out of a softcore CPU | 09:49 |
| azonenberg | running real code | 09:50 |
| lekernel_ | softcores are only good to run housekeeping or legacy software | 09:50 |
| azonenberg | also i have a project in mind that will involve me working with non-hardware people | 09:50 |
| azonenberg | I have dedicated accelerators for stuff like JPEG encoding that i'm working on | 09:50 |
| azonenberg | But the flight control code has to be in C | 09:50 |
| azonenberg | or C++ | 09:50 |
| azonenberg | or assembly | 09:50 |
| azonenberg | since i am working with CS people who don't know hardware | 09:51 |
| azonenberg | So i want to design a nice powerful architecture for them to run it on | 09:51 |
| azonenberg | the other motivation as i said is just cutting my teeth on computer architecture | 09:51 |
| azonenberg | this is not something i envision being a softcore forever, but custom ASICs are not cheap | 09:51 |
| azonenberg | if things go well and it works as planned i might try sending it out to mosis eventually | 09:52 |
| azonenberg | i would love to have a laptop running a CPU i designed | 09:52 |
| azonenberg | in 180nm TSMC or something | 09:52 |
| azonenberg | But i'm not that advanced yet :p | 09:52 |
| azonenberg | I read your post about the latticemico32 synthesis lol | 09:53 |
| azonenberg | and i think my processor will be faster | 09:53 |
| azonenberg | But i'd have to reimplement some of the xilinx hard IP cores like the memory controller | 09:54 |
| azonenberg | and their soft FPU | 09:54 |
| azonenberg | I'm pretty sure i can write a better FPU but i havent gotten around to it yet, and as long as it's interface-compatible with theirs it'd be a drop-in replacement | 09:54 |
| lekernel_ | their soft fpu? what's that? | 09:56 |
| lekernel_ | you're using coregen for a fpu? | 09:56 |
| azonenberg | Yes, for now | 09:56 |
| azonenberg | i wanted to focus on the datapath and interconnect first | 09:56 |
| azonenberg | then go and write myself an FPU when i had all of the surrounding stuff done | 09:56 |
| azonenberg | in the meantime i have theirs because it tells me an FPU of that size and speed is possible | 09:56 |
| azonenberg | iow, setting a lower bound | 09:56 |
| azonenberg | then i can try and outperform it with an open one | 09:56 |
| azonenberg | Coregen lets you generate floating point add/sub, multiply, divide, and sqrt units separately | 09:57 |
| azonenberg | So i'll replace them with my own one by one | 09:57 |
| azonenberg | But again the focus for now is on the datapath and microarchitecture more than implementation | 09:58 |
| lekernel_ | you can use the milkymist pfpu pipelines btw ... | 09:59 |
| azonenberg | The goal here is to practice efficient pipelined architecture | 10:00 |
| azonenberg | So i want to use as little premade code as possible | 10:00 |
| azonenberg | like i said i'm doing a thesis on computer architecture soon and i want practice | 10:00 |
| lekernel_ | but you reused the coregen pipelines already :-) | 10:01 |
| azonenberg | Temporarily, so i could build the other stuff around them | 10:01 |
| azonenberg | its not expected to stay | 10:01 |
| azonenberg | if i had used a free one i'd have less incentive to replace it :p | 10:01 |
| lekernel_ | so that's what I get for developing free hardware ... | 10:04 |
| azonenberg | production project? Sure | 10:04 |
| azonenberg | But for educational value sometimes its better to reimplement | 10:04 |
| azonenberg | Once i build mine, i'll compare it to yours and any other open ones i find | 10:05 |
| azonenberg | and use the best one in real projects | 10:05 |
| qi-bot | The Firmware build was successful, see images here: http://fidelio.qi-hardware.com/~xiangfu/build-milkymist/milkymist-firmware-11292011-1026/ | 11:28 |
| GitHub122 | [flickernoise] sbourdeauducq pushed 5 new commits to master: http://git.io/va42-g | 12:38 |
| GitHub122 | [flickernoise/master] Do not create ramdisk folder - Sebastien Bourdeauducq | 12:38 |
| GitHub122 | [flickernoise/master] filedialog: lock in ssd - Sebastien Bourdeauducq | 12:38 |
| GitHub122 | [flickernoise/master] filedialog: prevent slash in filenames - Sebastien Bourdeauducq | 12:38 |
| GitHub168 | [flickernoise] sbourdeauducq pushed 1 new commit to master: http://git.io/TnE0CQ | 13:52 |
| GitHub168 | [flickernoise/master] shutdown: rename button - Sebastien Bourdeauducq | 13:52 |
| GitHub120 | [flickernoise] sbourdeauducq pushed 1 new commit to master: http://git.io/qJvhUw | 13:59 |
| GitHub120 | [flickernoise/master] png: enable loading of RGBA images - Sebastien Bourdeauducq | 13:59 |
| qi-bot | The Firmware build was successful, see images here: http://fidelio.qi-hardware.com/~xiangfu/build-milkymist/milkymist-firmware-11292011-1343/ | 14:25 |
| GitHub36 | [flickernoise] sbourdeauducq pushed 1 new commit to master: http://git.io/ND-mFA | 14:30 |
| GitHub36 | [flickernoise/master] New patch - Sebastien Bourdeauducq | 14:30 |
| xiangfu | Hi | 16:11 |
| xiangfu | what is the difference between MicroBlaze and LM32? | 16:12 |
| xiangfu | is that same thing in one SOC system. on LM32 is open but MicroBlaze? | 16:12 |
| xiangfu | s/on/only | 16:14 |
| wpwrak | kinda like MIPS vs. ARM. same purpose, different origin, different style, etc. | 16:15 |
| xiangfu | wpwrak, got it. | 16:16 |
| qi-bot | The Firmware build was successful, see images here: http://fidelio.qi-hardware.com/~xiangfu/build-milkymist/milkymist-firmware-11292011-1535/ | 16:18 |
| lekernel_ | http://www.milkymist.org/flickernoise.html | 16:22 |
| lekernel_ | new screenshots | 16:22 |
| kristianpaul | MMM... :-) | 16:33 |
| kristianpaul | too many zoomed effects i think | 16:34 |
| wpwrak | bah. the end of the year is nearing. fireworks !! :) | 16:38 |
| kristianpaul | :-) | 16:38 |
| kristianpaul | yeah , fireworks are nice | 16:38 |
| lekernel_ | kristianpaul: if you design new patches that look better, there's no reason I would refuse them... | 16:39 |
| Action | wpwrak is amazed by how well USB can work even though he completely misunderstood the handshake between fpga and navre ... | 16:40 |
| wpwrak | let's see if anything still works after fixing that | 16:40 |
| lekernel_ | ? | 16:40 |
| wpwrak | i thought the SYNC would also set rx_pending ... | 16:41 |
| lekernel_ | no, rx_pending is only set after the first byte is completely received | 16:41 |
| wpwrak | (but i never tried to retrieve it. sometimes, two wrongs make an almost right :) | 16:41 |
| lekernel_ | but it doesn't make much change, does it? | 16:41 |
| lekernel_ | (I mean the first byte of "payload" after the sync, ofc) | 16:42 |
| wpwrak | yeah, just means that my loop was a little late | 16:42 |
| wpwrak | and unnecessarily complicated, too | 16:43 |
| kristianpaul | lekernel: i didn't mean it that way, it was just a comment (about what i like), no rush :-) | 16:44 |
| kristianpaul | and no, i don't imagine designing patches soon | 16:45 |
| Alarm | What is the best way to load the latest binary onto the M1? | 17:38 |
| lekernel | Alarm: as I said, web update | 17:38 |
| lekernel | http://www.xilinx.com/about/customer-innovation/index.htm | 17:39 |
| Alarm | not with the jtag? | 17:41 |
| lekernel | no, JTAG is for developers | 17:41 |
| lekernel | and generally slower and harder to use than the web update if you just want a release upgrade | 17:42 |
| lekernel | http://www.linux-kvm.org/wiki/images/1/1f/2011-forum-usb.pdf "Remove funky (ab-)use of the usb devices in bluetooth and milkymist." wtf? | 17:43 |
| kristianpaul | lekernel: nice !!!) | 17:48 |
| kristianpaul | "" | 17:48 |
| kristianpaul | Once an application for custom ASIC cores, this demanding computer graphics process is now the province of low-cost FPGAs. | 17:48 |
| Alarm | The problem is to download the latest version I'm using wget but it's not great for a set of files | 17:49 |
| lekernel | the M1 downloads the latest version itself | 17:50 |
| lekernel | just connect it to your internet router ... | 17:50 |
| wpwrak | lekernel: (ab-use) what on earth is that presentation about anyway ? | 18:00 |
| lekernel | USB in QEMU it seems | 18:01 |
| lekernel | but I asked myself the same question for a while ;) | 18:01 |
| wpwrak | ;-)) | 18:01 |
| Alarm | I want to do the update via jtag for pedagogic reasons. The "web update" method is of no interest to me | 18:01 |
| Alarm | my problem is basic. I am looking for a simple command to download binaries | 18:03 |
| Alarm | "wget -r" grabs all the files | 18:04 |
| Action | lekernel is giving orcc a try. of course, hundreds of MB of java bloat to install ... | 18:51 |
| kristianpaul | some comments from a friend "you can get a video switch for 8usd, but a mixer.. as a minimum, do fading from one picture to another" | 21:03 |
| kristianpaul | and please dont be angry with me for posting this, i'm just replying comments | 21:04 |
| lekernel | the M1 isn't a video switch or mixer. the switch functionality is just a little add-on. you can also get an arduino led blinker for $25 which can do the same as the front panel LEDs on the M1... same kind of stupid comparison | 21:05 |
| wpwrak | mixer may be tricky: you need two codecs for that | 21:05 |
| wpwrak | and i'm not sure if the chip we use has multiple codecs inside | 21:05 |
| lekernel | it does not | 21:06 |
| lekernel | M1 was never intended as a video mixer | 21:06 |
| kristianpaul | i'm very excited to bug other friends about the new M1/FN features and also bring back some feedback | 21:06 |
| kristianpaul | sure not | 21:06 |
| lekernel | the main feature of this software update is image support - and stress that it can be used with MIDI controllers. the rest is secondary. | 21:08 |
| kristianpaul | sure sure | 21:11 |
| kristianpaul | and for your happiness, he really likes the pacman video from wpwrak | 21:12 |
| wpwrak | and one more device enumerates :) | 21:12 |
| wpwrak | hehe ;-) | 21:13 |
| wpwrak | we need a few more images per patch. then we can have real games :) | 21:13 |
| kristianpaul | wee :) | 21:14 |
| wpwrak | C64 retro style :) | 21:14 |
| wpwrak | of course, the LV3 is still mute. that one's a tough cookie | 21:14 |
| wpwrak | stekern: the latest patch set may also fix the low-speed regression you experienced. | 21:45 |
| wpwrak | stekern: at least it removes quite a bit of confusion i had added before :) | 21:46 |
| stekern | wpwrak: cool, do you keep those patches in a git repo somewhere? | 21:58 |
| wpwrak | only locally | 21:59 |
| stekern | ok, well, lekernel seems to be quite quick to apply them anyways | 22:00 |
| stekern | I need to sign up on the ML | 22:01 |
| wpwrak | yeah. he probably has his alarm clock connected to "grep PATCH" :) | 22:01 |
| mwalle | lekernel: (usb abuse) thats qemu and it used the hid layer in a strange way | 22:50 |
| mwalle | gerd and i fixed that some time ago ;) | 22:50 |
| --- Wed Nov 30 2011 | 00:00 | |
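
The barrel-processor arithmetic azonenberg walks through in the log (16 threads, a context switch every clock, 200 MHz, 2-way superscalar, 2 cores) can be sketched roughly as follows. This is a back-of-the-envelope model only, not his design: the names `stall_slots` and `peak_mflops` are made up for illustration, and the model assumes strict round-robin issue and one floating-point operation per issue slot per cycle.

```python
def stall_slots(latency_cycles: int, threads: int) -> int:
    """Issue slots a single thread loses while waiting for a result.

    Assumption (not from the log): strict round-robin scheduling, so each
    thread gets exactly one issue slot every `threads` clock cycles.
    """
    if latency_cycles <= 0 or threads <= 0:
        raise ValueError("latency_cycles and threads must be positive")
    # Ceiling division: number of the thread's turns that pass before the
    # result is ready. The first of those turns would have come anyway.
    turns_needed = -(-latency_cycles // threads)
    return turns_needed - 1


def peak_mflops(clock_mhz: float, issue_width: int, cores: int) -> float:
    """Upper bound on throughput, assuming every issue slot is a FLOP."""
    return clock_mhz * issue_width * cores


if __name__ == "__main__":
    threads = 16
    # 16-cycle pipeline: result is back by the thread's next turn.
    print("16-cycle latency:", stall_slots(16, threads), "stalls")
    # 32 cycles total (e.g. 16-stage pipeline plus a 16-clock L1 to L2 round trip).
    print("32-cycle latency:", stall_slots(32, threads), "stalls")
    # 48 cycles total (16-stage pipeline plus a 32-clock round trip).
    print("48-cycle latency:", stall_slots(48, threads), "stalls")
    print("peak:", peak_mflops(200, 2, 2), "MFLOPS")
```

Under these assumptions the model reproduces the figures quoted in the conversation: no stall at 16 cycles, one at 32, two at 48, and an 800 MFLOPS peak for two cores. Real cache miss rates with 16 interleaved threads, which lekernel_ asks about, are not modelled here.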