copyleft hardware planet

October 13, 2017

Bunnie Studios

Why I’m Using Bitmarks on my Products

One dirty secret of hardware is that a profitable business isn’t just about design innovation, or even product cost reduction: it’s also about how efficiently one can move stuff from point A to B. This explains the insane density of hardware suppliers around Shenzhen; it explains the success of Ikea’s flat-packed furniture model; and it explains the rise of Amazon’s highly centralized, highly automated warehouses.

Unfortunately, reverse logistics – the system for handling returns & exchanges of hardware products – is not something on the forefront of a hardware startup’s agenda. In order to deal with defective products, one has to ship a product first – an all-consuming goal. However, leaving reverse logistics as a “we’ll fix it after we ship” detail could saddle the venture with significant unanticipated customer support costs, potentially putting the entire business model at risk.

This is because logistics are much more efficient in the “forward” direction: the cost of a centralized warehouse to deliver packages to an end consumer’s home address is orders of magnitude less than it is for a residential consumer to mail that same parcel back to the warehouse. This explains the miracle of Amazon Prime, when overnighting a pair of hand-knit mittens to your mother somehow costs you $20. Now repeat the hand-knit mittens thought experiment and replace it with a big-screen TV that has to find its way back to a factory in Shenzhen. Because the return shipment can no longer take advantage of bulk shipping discounts, the postage to China is likely more than the cost of the product itself!

Because of the asymmetry in forward versus reverse logistics cost, it’s generally not cost effective to send defective material directly back to the original factory for refurbishing, recycling, or repair. In many cases the cost of the return label plus the customer support agent’s time will exceed the cost of the product. This friction in repatriating defective product creates opportunities for unscrupulous middlemen to commit warranty fraud.

The basic scam works like this: a customer calls in with a defective product and gets sent a replacement. The returned product is sent to a local processing center, where it may be declared unsalvageable and slated for disposal. However, instead of a proper disposal, the defective goods “escape” the processing center and are resold as new to a different customer. The duped customer then calls in to exchange the same defective product and gets sent a replacement. Rinse lather repeat, and someone gets rich quick selling scrap at full market value.

Similarly, high-quality counterfeits can sap profits from companies. Clones of products are typically produced using cut-rate or recycled parts but sold at full price. What happens when customers then find quality issues with the clone? That’s right – they call the authentic brand vendor and ask for an exchange. In this case, the brand makes zero money on the customer but incurs the full cost of supporting a defective product. This kind of warranty fraud is pandemic in smart phones and can cost producers many millions of dollars per year in losses.


High-quality clones, like the card on the left, can cost businesses millions of dollars in warranty fraud claims.

Serial numbers help mitigate these problems, but it’s easy to guess a simple serial number. More sophisticated schemes tie serial numbers to silicon IDs, but that necessitates a system which can reliably download the serialization data from the factory. This might seem a trivial task but for a lot of reasons – from failures in storage media to human error to poor Internet connectivity in factories – it’s much harder than it seems to make this happen. And for a startup, losing an entire lot of serialization data due to a botched upload could prove fatal.

As a result, most hardware startups ship products with little to no plan for product serialization, much less a plan for reverse logistics. When the first email arrives from an unhappy customer, panic ensues, and the situation is quickly resolved, but by the time the product arrives back at the factory, the freight charges alone might be in the hundreds of dollars. Repeat this exercise a few dozen times, and any hope for a profitable run is rapidly wiped out.

I’ve wrestled with this problem on and off through several startups of my own and finally landed on a solution that looks promising: it’s reasonably robust, fraud-resistant, and dead simple to implement. The key is the bitmark – a small piece of digital data that links physical products to the blockchain.

Most people are familiar with blockchains through Bitcoin. Bitcoin uses the blockchain as a public ledger to prevent double-spending of the same virtual coin. This same public ledger can be applied to physical hardware products through a bitmark. Products that have been bitmarked can have their provenance tracked back to the factory using the public ledger, thus hampering cloning and warranty fraud – the physical equivalent of double-spending a Bitcoin.

One of my most recent hardware startups, Chibitronics, has teamed up with Bitmark to develop an end-to-end solution for Chibitronics’ newest microcontroller product, the Chibi Chip.

As an open hardware business, we welcome people to make their own versions of our product, but we can’t afford to give free Chibi Chips to customers that bought cut-rate clones and then report them as defective for a free upgrade to an authentic unit. We’re also an extremely lean startup, so we can’t afford the personnel to build a full serialization and reverse logistics system from scratch. This is where Bitmark comes in.

Bitmark has developed a turn-key solution for serialization and reverse logistics triage. They issue us bitmarks as lists of unique, six-word phrases. The six-word phrases are less frustrating for users to type in than strings of random characters. We then print the phrases onto labels that are stuck onto the back of each Chibi Chip.


Bitmark claim code on the back of a Chibi Chip

We release just enough of these pre-printed labels to the factory to run our authorized production quantities. This allows us to trace a bitmark back to a given production lot. It also prevents “ghost shifting” – that is, authorized factories producing extra bootleg units on a midnight shift that are sold into the market at deep discounts. Bitmark created a website for us where customers can then claim their bitmarks, thus registering their product and making it eligible for warranty service. In the event of an exchange or return, the product’s bitmark is updated to record this event. Then if a product fails to be returned to the factory, it can’t be re-claimed as defective because the blockchain ledger would evidence that bitmark as being mapped to a previously returned product. This allows us to defer the repatriation of the product to the factory. It also enables us to use unverified third parties to handle returned goods, giving us a large range of options to reduce reverse logistics costs.
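The claim and return flow described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the `ClaimLedger` class, its method names, and the in-memory dictionaries are hypothetical stand-ins for the public ledger, not Bitmark's actual API or data model.

```python
from enum import Enum


class State(Enum):
    ISSUED = "issued"      # label printed and released to the factory
    CLAIMED = "claimed"    # customer registered the phrase
    RETURNED = "returned"  # unit was exchanged or returned


class ClaimLedger:
    """Hypothetical in-memory stand-in for the public ledger."""

    def __init__(self):
        self._state = {}  # phrase -> State
        self._lot = {}    # phrase -> production lot

    def issue(self, phrase, lot):
        if phrase in self._state:
            raise ValueError("phrase already issued")
        self._state[phrase] = State.ISSUED
        self._lot[phrase] = lot

    def claim(self, phrase):
        # Unknown phrases (guessed or cloned labels) and phrases that were
        # already claimed or returned are rejected.
        if self._state.get(phrase) is not State.ISSUED:
            return False
        self._state[phrase] = State.CLAIMED
        return True

    def record_return(self, phrase):
        # Only a claimed unit is eligible for warranty service, and only once.
        if self._state.get(phrase) is not State.CLAIMED:
            return False
        self._state[phrase] = State.RETURNED
        return True

    def lot_of(self, phrase):
        return self._lot.get(phrase)


ledger = ClaimLedger()
phrase = "six word phrase goes right here"  # made-up example phrase
ledger.issue(phrase, lot="LOT-001")
first_claim = ledger.claim(phrase)
first_return = ledger.record_return(phrase)
second_return = ledger.record_return(phrase)  # the "escaped" unit resurfaces
print(first_claim, first_return, second_return)
```

In this sketch the second return attempt is refused because the phrase is already marked as returned, which is the property that makes it safe to leave the defective unit with a third party.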

Bitmark also plans to roll out a site where users can verify the provenance of their bitmarks, so buyers can check if a product’s bitmark is authentic and if it has been previously returned for problems before they buy it. This increases the buyer’s confidence, thus potentially boosting the resale value of used Chibi Chips.

For the cost and convenience of a humble printed label, Bitmark enhances control over our factories, enables production lot traceability, deters cloning, prevents warranty fraud, enhances confidence in the secondary market, and gives us ample options to streamline our reverse logistics.

Of course, the solution isn’t perfect. A printed label can be peeled off one product and stuck on another, so people could potentially just peel labels off good products and resell the labels to users with broken clones looking to upgrade by committing warranty fraud. This scenario could be mitigated by using tamper-resistant labels. And for every label that’s copied by a cloner, there’s one victim who will have trouble getting support on an authentic unit. Also, if users are generally lax about claiming their bitmark codes, it creates an opportunity for labels to be sparsely duplicated in an effort to ghost-shift/clone without being detected; but this can be mitigated with a website update that encourages customers to immediately register their bitmarks before using the web-based services tied to the product. We also have to exercise care in handling lists of unclaimed phrases because, until a customer registers their bitmark claim phrase in the blockchain, the phrases have value to would-be fraudsters.

But overall, for the cost and convenience, the solution outperforms all the other alternatives I’ve explored to date. And perhaps most importantly for hardware startups like mine that are short on time and long on tasks, printing bitmarks is simple enough for us to implement that it’s hard to justify doing anything else.

Disclosure: I am a technical advisor and shareholder of Bitmark.

by bunnie at October 13, 2017 02:32 PM

October 09, 2017

Harald Welte

Invited keynote + TTCN-3 talk at netdevconf 2.2 in Seoul

It was a big surprise that I've recently been invited to give a keynote on netfilter history at netdevconf 2.2.

First of all, I wouldn't have expected netfilter to be that relevant next to all the other [core] networking topics at netdevconf. Secondly, I've not been doing any work on netfilter for about a decade now, so my memory is a bit rusty by now ;)

Speaking of Rusty: Timing wise there is apparently a nice coincidence that I'll be able to meet up with him in Berlin later this month, i.e. hopefully we can spend some time reminiscing about old times and see what kind of useful input he has for the keynote.

I'm also asking my former colleagues and successors in the netfilter project to share with me any noteworthy events or anecdotes, particularly also covering the time after my retirement from the core team. So if you have something that you believe shouldn't be missing from a keynote on netfilter project history: Please reach out to me by e-mail ASAP and let me know about it.

To try to fend off the elder[ly] statesmen image that goes along with being invited to give keynotes about the history of projects you were working on a long time ago, I also submitted an actual technical talk: TTCN-3 and Eclipse Titan for testing protocol stacks, in which I'll cover my recent journey into TTCN-3 and TITAN land, and how I think those tools can help us in the Linux [kernel] networking community to productively produce tests for the various protocols.

As usual for netdevconf, there are plenty of other exciting talks in the schedule.

I'm very much looking forward to both visiting Seoul again, as well as meeting lots of the excellent people involved in the Linux networking subsystems. See ya!

by Harald Welte at October 09, 2017 10:00 PM

October 08, 2017

Harald Welte

Ten years Openmoko Neo1973 release anniversary dinner

As I noted earlier this year, 2017 marks the tenth anniversary of shipping the first Openmoko phone, the Neo1973.

On this occasion, a number of the key people managed to gather for an anniversary dinner in Taipei. Thanks to everyone who could make it; it was very good to see them together again. Sadly, far from everyone could attend. You have been missed!

The award for the most crazy attendee of the meeting goes out to my friend Milosch, who has actually flown from his home in the UK to Taiwan, only to meet up with old friends and attend the anniversary dinner.

You can see some pictures in Milosch's related tweet.

by Harald Welte at October 08, 2017 10:00 PM

October 05, 2017

Free Electrons

Buildroot Long Term Support releases: from 2017.02 to 2017.02.6 and counting

Buildroot is a widely used embedded Linux build system. A large number of companies and projects use Buildroot to produce customized embedded Linux systems for a wide range of embedded devices. Most of those devices are now connected to the Internet, and therefore subject to attacks if the software they run is not regularly updated to address security vulnerabilities.

The Buildroot project publishes a new release every three months, with each release providing a mix of new features, new packages, package updates, build infrastructure improvements… and security fixes. However, until earlier this year, as soon as a new version was published, the maintenance of the previous version stopped. This means that in order to stay up to date in terms of security fixes, users essentially had two options:

  1. Update their Buildroot version regularly. The big drawback is that they get not only security updates, but also many other package updates, which may be problematic when a system is in production.
  2. Stick with their original Buildroot version, and carefully monitor CVEs and security vulnerabilities in the packages they use, and update the corresponding packages, which obviously is a time-consuming process.

Starting with 2017.02, the Buildroot community has decided to offer one long term supported release every year: 2017.02 will be supported one year in terms of security updates and bug fixes, until 2018.02 is released. The usual three-month release cycle still applies, with 2017.05 and 2017.08 already being released, but users interested in a stable Buildroot version that is kept updated for security issues can stay on 2017.02.

Since 2017.02 was released on February 28th, 2017, six minor versions have been published on a fairly regular basis, almost every month, except in August.

With about 60 to 130 commits between each minor version, it is relatively easy for users to check what has been changed, and evaluate the impact of upgrading to the latest minor version to benefit from the security updates. The commits integrated in those minor versions are carefully chosen with the idea that users should be able to easily update existing systems.

In total, those six minor versions include 526 commits, of which 183 commits were security updates, representing roughly one third of the total number of commits. The other commits have been:

  • 140 commits to fix build issues
  • 57 commits to bump versions of packages for bug fixes. These almost exclusively include updates to the Linux kernel, using its LTS versions. For other packages, we are more conservative and generally don’t upgrade them.
  • 17 commits to address issues in the licensing description of the packages
  • 186 commits to fix miscellaneous issues, ranging from runtime issues affecting packages to bugs in the build infrastructure

The Buildroot community has already received a number of bug reports, patches or suggestions specifically targeting the 2017.02 LTS version, which indicates that developers and companies have started to adopt this LTS version.

Therefore, if you are interested in using Buildroot for a product, you should probably consider using the LTS version! We very much welcome feedback on this version, and help in monitoring the security vulnerabilities affecting software packages in Buildroot.

by Thomas Petazzoni at October 05, 2017 07:42 PM

October 04, 2017

Harald Welte

On Vacation

In case you're wondering about the lack of activity not only on this blog but also in git repositories, mailing lists and the like: I've been on vacation since September 13. It's my usual "one month in Taiwan" routine, during which I spend some time in Taipei, but also take several long motorbike tours around mostly rural Taiwan.

You can find the occasional snapshot in my twitter feed.

by Harald Welte at October 04, 2017 10:00 PM

October 02, 2017

Free Electrons

Free Electrons at Embedded and Kernel Recipes 2017

Over the last few years, the Kernel Recipes conference has become a very interesting event, with an original single track format and a limited number of attendees, which fosters communication and networking. Held in Paris, France, it is obviously a conference of choice for Free Electrons engineers to attend and speak at. We have participated in multiple editions: Free Electrons engineer Maxime Ripard gave a talk at the 2014 edition, while Thomas Petazzoni gave a talk at the 2013 edition.

In 2017, the organizers decided to complement the 3-day Kernel Recipes conference with a 1-day Embedded Recipes event, and Free Electrons will participate by having two engineers attend those events and give talks.

If you’re interested in attending one of those events, make sure to register on time: there are only 100 seats available!

by Thomas Petazzoni at October 02, 2017 01:59 PM

October 01, 2017

Michele's GNSS blog

Tribute to Prof. Kai Borre


Whilst attending the latest ION GNSS+ conference I had confirmation that Prof. Kai Borre passed away this summer. He has been a very important reference to me, especially in the early stages of my career. I am sure many other radio-navigation geodesists and DSP engineers will agree with me. I knew him pretty well and I could not find an epitaph anywhere, so today I feel compelled to leave my tribute to him here, and I hope others will share my feelings.


Rest in peace Kai. My gratitude, for inspiring me until your very last moment.

by noreply@blogger.com (Michele Bavaro) at October 01, 2017 07:52 PM

September 30, 2017

Bunnie Studios

Name that Ware, September 2017

The Ware for September 2017 is shown below.

And here is the underside of the plug-in module from the left hand side of the PCB:

Thanks to Chris for sending in this gorgeous ware. I really appreciate both the aesthetic beauty of this ware, as well as the exotic construction techniques employed.

by bunnie at September 30, 2017 03:34 PM

Winner, Name that Ware August 2017

The ware for August 2017 is the controller IC for a self-flashing (two-pin, T1 case) RGB LED. It’s photographed through the lens of the LED, which is why the die appears so distorted. Somehow, Piotr — the first poster — guessed it on the first try without much explanation. Congrats, email me for your prize!

by bunnie at September 30, 2017 03:32 PM

September 28, 2017

Free Electrons

Free Electrons opens a new office in Lyon, France

After Toulouse and Orange, Lyon is the third city chosen for opening a Free Electrons office. Since September 1st of this year (2017), Alexandre Belloni and Grégory Clement have been working there, more precisely in Oullins, close to the subway and the train station. It is the first step towards growing the Lyon team, with the opportunity to welcome interns and engineers.


Their new desks are already crowded by many boards running our favorite system.

by Gregory Clement at September 28, 2017 07:04 AM

September 27, 2017

Free Electrons

Mali OpenGL support on Allwinner platforms with mainline Linux

As most people know, getting GPU-based 3D acceleration to work on ARM platforms has always been difficult, due to the closed nature of the support for such GPUs. Most vendors provide closed-source OpenGL implementations in the form of binary blobs, whose quality depends on the vendor.

This situation is getting better and better through vendor-funded initiatives such as the one for the Broadcom VC4 and VC5, or through reverse engineering projects like Nouveau on Tegra SoCs, Etnaviv on Vivante GPUs, and Freedreno on Qualcomm’s. However, there are still GPUs where you do not have the option to use a free software stack: PowerVR from Imagination Technologies and Mali from ARM (even though there is some progress on the reverse engineering effort).

Allwinner SoCs use either a Mali GPU from ARM or a PowerVR from Imagination Technologies, and therefore, support for OpenGL on those platforms using a mainline Linux kernel has always been a problem. This is further complicated by the fact that Allwinner is mostly interested in Android, which uses a different C library; this prevents the use of Android binaries on traditional glibc-based systems (unless libhybris is used).

However, we are happy to announce that Allwinner gave us clearance to publish the userspace binary blobs that allow OpenGL to be supported on Allwinner platforms that use a Mali GPU from ARM, using a recent mainline Linux kernel. Of course, those are closed source binary blobs and not a nice fully open-source solution, but it nonetheless allows everyone to have OpenGL support working, while taking advantage of all the benefits of a recent mainline Linux kernel. We have successfully used those binary blobs on customer projects involving the Allwinner A33 SoC, and they should work on all Allwinner SoCs using the Mali GPU.

In order to get GPU support to work on your Allwinner platform, you will need:

  • The kernel-side driver, available on Maxime Ripard’s GitHub repository. This is essentially the Mali kernel-side driver from ARM, plus a number of build and bug fixes to make it work with recent mainline Linux kernels.
  • The Device Tree description of the GPU. We introduced Device Tree bindings for Mali GPUs in the mainline kernel a while ago, so that Device Trees can describe such GPUs. Such description has been added for the Allwinner A23 and A33 SoCs as part of this commit.
  • The userspace blob, which is available on the Free Electrons GitHub repository. It currently provides the r6p2 version of the driver, with support for both fbdev and X11 systems. Hopefully, we’ll gain access to newer versions in the future, with additional features (such as GBM support).

If you want to use it in your system, the first step is to have the GPU definition in your device tree if it’s not already there. Then, you need to compile the kernel module:

git clone https://github.com/mripard/sunxi-mali.git
cd sunxi-mali
export CROSS_COMPILE=$TOOLCHAIN_PREFIX
export KDIR=$KERNEL_BUILD_DIR
export INSTALL_MOD_PATH=$TARGET_DIR
./build.sh -r r6p2 -b
./build.sh -r r6p2 -i

It should install the mali.ko Linux kernel module into the target filesystem.

Now, you can copy the OpenGL userspace blobs that match your setup, most likely the fbdev or X11-dma-buf variant. For example, for fbdev:

git clone https://github.com/free-electrons/mali-blobs.git
cd mali-blobs
cp -a r6p2/fbdev/lib/lib_fb_dev/lib* $TARGET_DIR/usr/lib

You should be all set. Of course, you will have to link your OpenGL applications or libraries against those user-space blobs. You can check that everything works using OpenGL test programs such as es2_gears for example.

by Maxime Ripard at September 27, 2017 09:34 AM

September 21, 2017

Elphel

Long range multi-view stereo camera with 4 sensors

Figure 1. Four sensor stereo camera CAD model


The four-camera stereo rig prototype is capable of measuring distances thousands of times greater than the camera baseline over a wide (60 by 45 degrees) field of view. With 150 mm distance between lenses it provides ranging data at 200 meters with 10% accuracy; production units will have higher accuracy. The initial implementation uses software post-processing, but the core part of the software (the tile processor) is designed as an FPGA simulation and will be moved to the actual FPGA of the camera for real-time applications.

Scroll down or just hyper-jump to Scene viewer for the links to see example images and reconstructed scenes.

Background

Most modern advances in the area of visual 3d reconstruction are related to structure from motion (SfM), where high quality models are generated from image sequences, including those from uncalibrated cameras (such as cellphone ones). Other fast-growing applications depend on active ranging with either LIDAR scanning technology or time of flight (ToF) sensors.

Each of these methods has its limitations, and while widespread smartphone cameras have attracted most of the interest in algorithm and software development, there are some applications where the narrow baseline technology (where the distance between the sensors is much smaller than the distance to the objects) has advantages.

Such applications include autonomous vehicle navigation, where other objects in the scene are moving and 3-d data is needed immediately (not when the complete image sequence is captured), and the elements to be ranged are ahead of the vehicle so previous images would not help much. ToF sensors are still very limited in range (a few meters) and scanning LIDAR systems are either slow to update or have a very limited field of view. Passive (visual only) ranging may be desired for military applications where the system should stay invisible by not shining lasers around.

Technology snippets

Narrow baseline and subpixel resolution

The main challenge for narrow baseline systems is that the distance resolution is much worse than the lateral one. The minimal resolved 3d element, the voxel, is very far from resembling a cube (as 2d pixels are usually squares). With the dimensions we use (pixel size 0.0022 mm, lens focal length f = 4.5 mm and a baseline of 150 mm), such a voxel at 100 m distance is 50 mm high by 50 mm wide and 32 meters deep. The good thing is that while the lateral resolution generally is just one pixel (it can be better only with additional knowledge about the object), the depth resolution can be improved by an order of magnitude, under reasonable assumptions, by using subpixel resolution. This is possible when there are multiple shifted images of the same object (which for such a high range to baseline ratio can safely be assumed fronto-parallel) and every object is represented in each image by multiple pixels. With 0.1 pixel resolution in disparity (the shift between the two images) the depth dimension of the voxel at 100 m distance is 3.2 meters. And as we need multi-pixel objects for the subpixel disparity resolution, the voxel lateral dimensions increase (there is a way to restore the lateral resolution to a single pixel in most cases). With a fixed-width window for the image matching (we use an 8×8 pixel grid of 16×16 pixel overlapping tiles, similar to what is used by some image/video compression algorithms such as JPEG), the voxel dimensions at 100 meter range become 0.4 m x 0.4 m x 3.2 m. Still not a cube, but the difference is significantly less dramatic.
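The voxel dimensions quoted above follow from simple pinhole stereo geometry and can be reproduced with a few lines. The sketch below assumes an ideal pinhole model with parallel optical axes; the function name and its parameters are illustrative and not taken from the camera software.

```python
def voxel_size_m(range_m, pixel_mm=0.0022, focal_mm=4.5, baseline_mm=150.0,
                 disparity_step_pix=1.0, window_pix=1):
    """Return (lateral, depth) voxel size in meters for a pinhole stereo pair."""
    range_mm = range_m * 1000.0
    # One pixel subtends pixel/focal radians, so its footprint grows with range.
    lateral_mm = window_pix * range_mm * pixel_mm / focal_mm
    # Z = f * B / (d * pixel)  =>  |dZ| = Z^2 * pixel / (f * B) per pixel of disparity.
    depth_mm = (disparity_step_pix * range_mm ** 2 * pixel_mm
                / (focal_mm * baseline_mm))
    return lateral_mm / 1000.0, depth_mm / 1000.0


whole_pixel = voxel_size_m(100.0)                         # 1 pixel disparity step
subpixel = voxel_size_m(100.0, disparity_step_pix=0.1)    # 0.1 pixel disparity step
tiled = voxel_size_m(100.0, disparity_step_pix=0.1, window_pix=8)  # 8x8 pixel grid

for name, (lateral, depth) in [("whole pixel", whole_pixel),
                               ("subpixel", subpixel),
                               ("8x8 grid", tiled)]:
    print(f"{name}: {lateral:.3f} m lateral, {depth:.2f} m deep")
```

The results (about 0.049 m by 32.6 m, then 3.26 m deep, then about 0.39 m lateral) agree with the rounded figures in the text.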

Subpixel accuracy and the lens distortions

Matching images with subpixel accuracy requires that the optical distortion of each lens is known and compensated with the same or better precision. The most popular way to represent lens distortions is to use a radial distortion model, where the relation between the distorted and the ideal pin-hole camera image is expressed as a polynomial of the point radius, so in polar coordinates the angle stays the same while the radius changes. Fisheye lenses are better described with the “f-theta” model, where the linear radial distance in the focal plane corresponds to the angle between the lens axis and the ray to the object.
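As a minimal sketch of the two models mentioned above: the even-power polynomial form used here is a common textbook choice and an assumption on my part, since the text does not give the exact polynomial or coefficients used for these lenses. All function names and coefficient values are illustrative.

```python
import math


def radial_distort(x, y, coeffs):
    """Map ideal pinhole (x, y) to distorted coordinates.

    Only the radius is scaled, so the polar angle is unchanged.
    scale = 1 + k1*r^2 + k2*r^4 + ...  (assumed polynomial form)
    """
    r = math.hypot(x, y)
    scale = 1.0 + sum(k * r ** (2 * (i + 1)) for i, k in enumerate(coeffs))
    return x * scale, y * scale


def pinhole_radius(theta_rad, focal_mm):
    """Image radius of a ray at angle theta for an ideal pinhole camera."""
    return focal_mm * math.tan(theta_rad)


def f_theta_radius(theta_rad, focal_mm):
    """Image radius of the same ray for an ideal f-theta (fisheye) lens."""
    return focal_mm * theta_rad


xd, yd = radial_distort(0.3, 0.4, [-0.1, 0.01])  # made-up coefficients
print(xd, yd)
print(pinhole_radius(math.radians(45), 4.5), f_theta_radius(math.radians(45), 4.5))
```

For the same 45 degree ray the f-theta radius is smaller than the pinhole one, which is why a fisheye lens can fit a much wider field of view onto the sensor.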

Such radial models provide accurate results only with ideal lens elements and when such elements are assembled so that the axis of each individual lens element precisely matches the axes of the other elements, both in position and orientation. In real lenses each optical element has a minor misalignment, and that limits the radial model. For the lenses we have dealt with, and with 5MPix sensors, it was possible to get down to 0.2 – 0.3 pixels, so we supplemented the radial distortion described by the analytical formula with a table-based residual image correction. Such correction reduced the minimal resolved disparity to 0.05 – 0.08 pixels.

Fixed vs. variable window image matching and FPGA

Modern multi-view stereo systems that work with wide baselines use elaborate algorithms with variable size windows when matching image pairs, down to single pixels. They aggregate data from the neighbor pixels at later processing stages, which allows them to handle occlusions and perspective distortions that make paired images different. With the narrow baseline system, ranging objects at distances that are hundreds to thousands of times larger than the baseline, the difference in perspective distortions of the images is almost always very small. And as getting subpixel resolution requires matching many pixels at once anyway, the use of fixed size image tiles instead of individual pixels does not reduce the flexibility of the algorithm much.

Processing of the fixed-size image tiles promises significant advantage – hardware-accelerated pixel-level tile processing combined with the higher level software that operates with the per-tile data rather than with per-pixel one. Tile processing can be implemented within the FPGA-friendly stream processing paradigm leaving decision making to the software. Matching image tiles may be implemented using methods similar to those used for image and especially video compression where motion vector estimation is similar to calculation of the disparity between the stereo images and similar algorithms may be used, such as phase-only correlation (PoC).
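To illustrate the phase-only correlation idea mentioned above, here is a deliberately simple one-dimensional sketch in pure Python. It uses a naive DFT on a short made-up signal and finds only the integer shift; the actual tile processor works on 2d tiles in the frequency domain and resolves subpixel shifts, neither of which is attempted here. All names are illustrative.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform, good enough for short signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def phase_only_correlation(a, b, eps=1e-12):
    """Correlate a and b using only the phase of the cross-power spectrum."""
    cross = []
    for x, y in zip(dft(a), dft(b)):
        c = x * y.conjugate()
        mag = abs(c)
        # Discard the magnitude; empty frequency bins contribute nothing.
        cross.append(c / mag if mag > eps else 0j)
    return [v.real for v in dft(cross, inverse=True)]


def estimate_shift(a, b):
    """Return the circular shift s for which a[t] best matches b[t - s]."""
    corr = phase_only_correlation(a, b)
    n = len(corr)
    peak = max(range(n), key=corr.__getitem__)
    return peak if peak <= n // 2 else peak - n


reference = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
n = len(reference)
shifted_right = [reference[(t - 3) % n] for t in range(n)]
shifted_left = [reference[(t + 2) % n] for t in range(n)]
print(estimate_shift(shifted_right, reference), estimate_shift(shifted_left, reference))
```

Because the magnitude is normalized away, the correlation result is a sharp peak at the shift position regardless of the texture contrast, which is what makes the method attractive for disparity estimation.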

Two dimensional array vs. binocular and inline camera rigs

Usually stereo cameras or fixed baseline multi-view stereo are binocular systems, with just two sensors. Less common systems have more than two lenses positioned along the same line. Such configurations improve the useful camera range (ability to measure near and far objects) and reduce ambiguity when dealing with periodic object structures. Even less common are the rigs where the individual cameras form a 2d structure.

In this project we used a camera with 4 sensors located in the corners of a square, so they are not co-linear. Correlation-based matching of the images depends on the detailed texture in the matched areas of the images – perfectly uniform objects produce no data for depth estimation. Additionally, some common types of image details may be unsuitable for certain orientations of the camera baselines. A vertical concrete pole can be easily correlated by two horizontally positioned cameras, but if the baseline is turned vertical, the same binocular camera rig would fail to produce a disparity value. The same is true when trying to capture horizontal features with a horizontal binocular system – such predominantly horizontal features are common when viewing near flat horizontal surfaces at high angles of incidence (almost parallel to the view direction).

With four cameras we process four image pairs – 2 horizontal (top and bottom) and 2 vertical (right and left), and depending on the application requirements for particular image region it is possible to combine correlation results of all 4 pairs, or just horizontal and vertical separately. When all 4 baselines have equal length it is easier to combine image data before calculating the precise location of the correlation maximums – 2 pairs can be combined directly, and the 2 others after rotating tiles by 90 degrees (swapping X and Y directions, transposing the tiles 2d arrays).
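The combination step described above can be sketched as follows. The text only says that the pairs are "combined"; plain averaging is my assumption, and the function names and the tiny 3×3 tiles are illustrative only.

```python
def transpose(tile):
    """Swap X and Y of a square 2d tile (list of rows)."""
    return [list(row) for row in zip(*tile)]


def combine_pairs(top, bottom, left, right):
    """Average four same-size square correlation tiles into one.

    top/bottom come from the horizontal pairs and are used directly;
    left/right come from the vertical pairs and are transposed first so
    that their disparity axis lines up with the horizontal one.
    """
    tiles = [top, bottom, transpose(left), transpose(right)]
    size = len(top)
    return [[sum(t[r][c] for t in tiles) / len(tiles) for c in range(size)]
            for r in range(size)]


# Horizontal pairs peak one step along X; vertical pairs peak one step along Y.
horizontal = [[0, 0, 0],
              [0, 0, 1],
              [0, 0, 0]]
vertical = [[0, 0, 0],
            [0, 0, 0],
            [0, 1, 0]]
combined = combine_pairs(horizontal, horizontal, vertical, vertical)
print(combined)
```

After the transposition all four peaks coincide, so the combined tile has a single maximum whose position can then be refined once instead of four times.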

Image rectification and resampling

Many implementations of the multi-view stereo processing start with the image rectification that involves correction for the perspective and lens distortions, projection of the individual images to the common plane. Such projection simplifies image tiles matching by correlation, but as it involves resampling of the images, it either reduces resolution or requires upsampling and so increases required memory size and processing complexity.

This implementation does not require full de-warping of the images and related resampling with fractional pixel shifts. Instead we split geometric distortion of each lens into two parts:

  • common (average) distortion of all four lenses approximated by analytical radial distortion model, and
  • small residual deviation of each lens image transformation from the common distortion model

Common radial distortion parameters are used to calculate matching tile location in each image, and while integer rounded pixel shifts of the tile centers are used directly when selecting input pixel windows, the fractional pixel remainders are preserved and combined with the other image shifts in the FPGA tile processor. Matching of the images is performed in this common distorted space, the tile grid is also mapped to this presentation, not to the fully rectified rectilinear image.
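The split between the integer part of a tile position (used to select the pixel window) and the fractional remainder (kept for later processing) might look like this. The function names, the 16 pixel tile size default, and the rounding convention are assumptions for illustration, not the actual tile processor interface.

```python
import math


def split_shift(center):
    """Split a tile-center coordinate into (nearest integer, fractional remainder)."""
    whole = math.floor(center + 0.5)
    return whole, center - whole


def tile_window(center_x, center_y, tile=16):
    """Return the integer window origin and the residual subpixel shift."""
    ix, fx = split_shift(center_x)
    iy, fy = split_shift(center_y)
    # The window is cut on whole pixels; (fx, fy) is applied later together
    # with the other fractional shifts instead of resampling the image here.
    return (ix - tile // 2, iy - tile // 2), (fx, fy)


origin, residual = tile_window(123.37, 456.80)
print(origin, residual)
```

Keeping the remainder in the range from -0.5 to 0.5 pixels means the image itself is never resampled at this stage, only windowed.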

Small individual lens deviations from the common distortion model are smooth 2-d functions over the 2-d image plane, they are interpolated from the calibration data stored for the lower resolution grid.

We use low distortion sorted lenses with matching focal lengths to make sure that the scale mismatch between the image tiles is less than tile size in the target subpixel intervals (0.1 pix). Meeting the low distortion requirement extends the distance range toward near objects, because at higher disparity values the matching tiles in the different images land in differently distorted areas. Focal length matching allows us to use the modulated complex lapped transform (CLT), which, like the discrete Fourier transform (DFT), is invariant to shifts but not to scaling (log-polar coordinates are not applicable here, as such a transformation would destroy shift invariance).

Enhancing images by correcting optical aberrations with space-variant deconvolution

Matching of the images acquired with almost identical lenses is rather insensitive to the lens aberrations that degrade image quality (mostly by reducing sharpness), especially in the peripheral image areas. Aberration correction is still needed to get sharp textures in the resulting 3d models over the full field of view, because the resolution of modern sensors is usually better than what lenses can provide. Correction can be implemented with space-variant (different kernels for different areas of the image) deconvolution; we routinely use it for post-processing of Eyesis4π images. The DCT-based implementation is described in the earlier blog post.

Space-variant deconvolution kernels can absorb (be combined with during calibration processing) the individual lens deviations from the common distortion model, described above. Aberration correction and image rectification to the common image space can be performed simultaneously using the same processing resources.

Two dimensional vs. single dimensional matching along the epipolar lines

A common approach for matching image pairs is to replace the two-dimensional correlation with a single-dimensional task by correlating pixels along the epipolar lines, which are just horizontal lines for horizontally built binocular systems with parallel optical axes. Aggregation of the correlation maximum locations between neighboring parallel lines of pixels is performed in the image pixel domain after each line is processed separately.

For tile-based processing it is beneficial to perform a full 2-d correlation as the phase correlation is performed in the frequency domain, and after the pointwise multiplication during aberration correction the image tiles are already available in the 2d frequency domain. Two dimensional correlation implies aggregation of data from multiple scan lines, it can tolerate (and be used to correct) small lens misalignments, with appropriate filtering it can be used to detect (and match) linear features.

Implementation

Prototype camera

Experimental camera looks similar to Elphel regular H-camera – we just incorporated different sensor front ends (3d CAD model) that are used in Eyesis4π and added adjustment screws to align optical axes of the lenses (heading and tilt) and orientations of the image sensors (roll). Sensors are 5 Mpix 1/2″ format On Semiconductor MT9P006, lenses – Evetar N125B04530W.

We selected lenses with the same focal length within 1%, and calibrated the camera using our standard camera rotation machine and the target pattern. As we do not yet have production adjustment equipment and software, the adjustment took several iterations: calibrating the camera and measuring extrinsic parameters of each sensor front end, then rotating each of the adjustment screws according to spreadsheet-calculated values, and then re-running the whole calibration process again. Finally the calibration results: radial distortion parameters, SFE extrinsic parameters, vignetting and deconvolution kernels were converted to the form suitable for run-time application (now – during post-processing of the captured images).

Figure 2. Camera block diagram

This prototype still uses 3d-printed parts and such mounts proved to be not stable enough, so we had to add field calibration and write code for bundle adjustment of the individual imagers orientations from the 2-d correlation data for each of the 4 individual pairs.

Camera performance depends on the actual mechanical stability, software-compensation can only partially mitigate this misalignment problem and the precision of the distance measurements was reduced when cameras went off by more than 20 pixels after being carried in a backpack. Nevertheless the scene reconstruction remained possible.

Software

Multi-view stereo rigs are capable of capturing dynamic scenes, so our goal is to make a real-time system with most of the heavy-weight processing done in the FPGA.

One of the major challenges here is how to combine parallel and stream processing capabilities of the FPGA with the flexibility of the software needed for implementation of the advanced 3d reconstruction algorithms. Our approach is to use the FPGA-based tile processor to perform uniform operations on the lists of “tiles” – fixed square overlapping windows in the images. The FPGA processes tile data at the pixel level, while software operates on whole tiles.

Figure 2 shows the overall block diagram of the camera, Figure 3 illustrates details of the tile processor.

Figure 3. FPGA tile processor

Initial implementation does not contain actual FPGA processing, so far we only tested in FPGA some of the core functions – two dimensional 8×8 DCT-IV needed for both 16×16 CLT and ICLT. Current code consists of the two separate parts – one part (tile processor) simulates what will be moved to the FPGA (it handles image tiles at the pixel level), and the other one is what will remain software – it operates on the tile level and does not deal with the individual pixels. These two parts interact using shared system memory, tile processor has exclusive access to the dedicated image buffer and calibration data.

Each tile is 16×16 pixels square with 8 pixel overlap, software prepares tile list including:

  • tile center X,Y (for the virtual “center” image),
  • center disparity, so that each of the 4 image tiles will be shifted accordingly, and
  • the code of operation(s) to be performed on that tile.
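A tile list entry of this kind could be modeled as below. This is only a sketch: the names `TileTask`, `TileOp` and `full_frame_tasks`, the two operation codes and the placement of tile centers on the 8-pixel grid are all assumptions made for illustration, since the actual memory layout and encoding are not given here.

```python
from dataclasses import dataclass
from enum import IntFlag

TILE_SIZE = 16   # pixels per tile side (from the text)
TILE_STEP = 8    # 8 pixel overlap gives an 8 pixel grid step

class TileOp(IntFlag):
    """Hypothetical operation codes; the real encoding is not specified."""
    CORRELATE = 1
    TEXTURE = 2

@dataclass(frozen=True)
class TileTask:
    center_x: float    # tile center X in the virtual "center" image
    center_y: float    # tile center Y in the virtual "center" image
    disparity: float   # center disparity used to shift the 4 image tiles
    ops: TileOp        # operation(s) to perform on this tile

def full_frame_tasks(width, height, disparity, ops):
    """Build one task per grid position, all at the same disparity."""
    return [
        TileTask(tx * TILE_STEP + TILE_STEP / 2,
                 ty * TILE_STEP + TILE_STEP / 2,
                 disparity, ops)
        for ty in range(height // TILE_STEP)
        for tx in range(width // TILE_STEP)
    ]
```

Software that follows surfaces or skips empty volumes would build shorter lists with per-tile disparities instead of a full-frame list.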

Figure 4. Correlation processor

Tile processor performs all or some (depending on the tile operation codes) of the following operations:

  • Reads the tile tasks from the shared system memory.
  • Calculates locations and loads image and calibration data from the external image buffer memory (using on-chip memory to cache data, as the overlapping nature of the tiles makes each pixel participate on average in 4 neighbor tiles).
  • Converts tiles to frequency domain using CLT based on 2d DCT-IV and DST-IV.
  • Performs aberration correction in the frequency domain by pointwise multiplication by the calibration kernels.
  • Calculates correlation-related data (Figure 4) for the tile pairs, resulting in tile disparity and disparity confidence values for all pairs combined, and/or more specific correlation types by pointwise multiplication, inverse CLT to the pixel domain, filtering and local maximums extraction by quadratic interpolation or windowed center of mass calculation.
  • Calculates combined texture for the tile (Figure 5), using alpha channel to mask out pixels that do not match – this is how single-pixel lateral resolution is effectively restored after aggregating individual pixels to tiles. Textures can be combined after only programmed shifts according to specified disparity, or use additional shift calculated in the correlation module.
  • Calculates other integral values for the tiles (Figure 5), such as per-channel number of mismatched pixels – such data can be used for quick second-level (using tiles instead of pixels) correlation runs to determine which 3d volumes potentially have objects and so need regular (pixel-level) matching.
  • Finally tile processor saves results: correlation values and/or texture tile to the shared system memory, so software can access this data.
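Of the steps above, the extraction of a local maximum by quadratic interpolation is easy to illustrate in isolation. The sketch below fits a parabola through the peak sample and its two neighbors, separately along X and Y. It is a generic textbook version under that assumption, not the project's code, and the function name is hypothetical.

```python
import numpy as np

def subpixel_peak(corr):
    """Return (x, y) of the correlation maximum with subpixel refinement."""
    corr = np.asarray(corr, dtype=float)
    iy, ix = np.unravel_index(np.argmax(corr), corr.shape)

    def refine(minus, center, plus):
        # Vertex of the parabola through samples at -1, 0, +1.
        denom = minus - 2.0 * center + plus
        return 0.0 if denom == 0 else 0.5 * (minus - plus) / denom

    dx = dy = 0.0
    if 0 < ix < corr.shape[1] - 1:   # need both horizontal neighbors
        dx = refine(corr[iy, ix - 1], corr[iy, ix], corr[iy, ix + 1])
    if 0 < iy < corr.shape[0] - 1:   # need both vertical neighbors
        dy = refine(corr[iy - 1, ix], corr[iy, ix], corr[iy + 1, ix])
    return ix + dx, iy + dy
```

For real correlation peaks, which are not exactly parabolic, this estimate carries a small bias that depends on the fractional position of the true maximum.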

Figure 5. Texture processor

Single tile processor operation deals with the scene objects that would be projected to this tile’s 16×16 pixels square on the sensor of the virtual camera located in the center between the four actual physical cameras. The single pass over the tile data is limited not just laterally, but in depth also because for the tiles to correlate they have to have significant overlap. 50% overlap corresponds to the correlation offset range of ±8 pixels, better correlation contrast needs 75% overlap or ±4 pixels. The tile processor “probes” not all the voxels that project to the same 16×16 window of the virtual image, but only those that belong to the certain distance range – the distances that correspond to the disparities ±4 pixels from the value provided for the tile.

That means that a single processing pass over a tile captures data in a disparity space volume, or a macro-voxel of 8 pixels wide by 8 pixels high by 8 pixels deep (considering the central part of the overlapping volumes). Capturing the whole scene may therefore require multiple passes for the same tile with different disparity values. There are ways to avoid a full range disparity sweep (with 8 pixel increments) for all tiles: following surfaces, detecting occlusions and discontinuities, and second-level correlation of tiles instead of the individual pixels.
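As a rough illustration of the cost of a brute-force sweep, the sketch below lists the center disparities needed to cover a range from zero (infinity) up to a chosen maximum when each pass covers ±4 pixels. Starting the sweep at zero and the function name are assumptions made for the example.

```python
import math

def sweep_disparities(max_disparity, half_range=4.0):
    """Center disparities for passes that each cover center +/- half_range."""
    step = 2.0 * half_range
    # Last center must reach max_disparity with its +half_range margin.
    last = max(0, math.ceil((max_disparity - half_range) / step))
    return [i * step for i in range(last + 1)]
```

Covering disparities up to 64 pixels this way takes 9 passes per tile, which is why the shortcuts listed above matter.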

Another reason for the multi-pass processing of the same tile is to refine the disparity measured by correlation. When dealing with subpixel coordinates of the correlation maximums – either located by quadratic approximation or by some form of center of mass evaluation, the calculated values may have bias and disparity histograms reveal modulation with the pixel period. Second “refine” pass, where individual tiles are shifted by the disparity measured in the previous pass reduces the residual offset of the correlation maximum to a fraction of a pixel and mitigates this type of bias. Tile shift here means a combination of the integer pixel shift of the source images and the fractional (in the ±0.5 pixel range) shift that is performed in the frequency domain by multiplication by the cosine/sine phase rotator.
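The split of a tile shift into integer and fractional parts, and the fractional shift by phase rotation, can be sketched as below. Note the substitution: this example uses a plain DFT from NumPy rather than the CLT used in the tile processor, so the shift is cyclic and it only illustrates the principle. Function names are hypothetical.

```python
import math
import numpy as np

def split_shift(shift):
    """Split a shift into an integer part and a remainder in [-0.5, 0.5)."""
    whole = math.floor(shift + 0.5)
    return whole, shift - whole

def fractional_shift(tile, dx, dy):
    """Cyclically shift a 2-d tile by (dx, dy) pixels in the DFT domain."""
    tile = np.asarray(tile, dtype=float)
    fy = np.fft.fftfreq(tile.shape[0])[:, None]   # cycles per pixel, vertical
    fx = np.fft.fftfreq(tile.shape[1])[None, :]   # cycles per pixel, horizontal
    # Phase rotator: cos(phi) - i*sin(phi) for every frequency bin.
    rotator = np.exp(-2j * np.pi * (fx * dx + fy * dy))
    return np.fft.ifft2(np.fft.fft2(tile) * rotator).real
```

A measured disparity of 3.7 pixels, for example, splits into an integer shift of 4 applied when selecting the input pixel window and a remainder of about -0.3 applied by the rotator.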

Total processing time and/or required FPGA resources linearly depend on the number of required tile processor operations and the software may use several methods to reduce this number. In addition to the two approaches mentioned above (following surfaces and second-level correlation) it may be possible to reduce the field of view to a smaller area of interest, predict current frame scene from the previous frames (as in 2d video compression) – tile processor paradigm preserves flexibility of the various algorithms that may be used in the scene 3d reconstruction software stack.

Scene viewer

The viewer for the reconstructed scenes is here: https://community.elphel.com/3d+map (viewer source code).

Figure 6. 3d+map index page

Index page shows a map (you may select from several providers) with the markers for the locations of the captured scenes. On the left there is a vertical ribbon of the thumbnails – you may scroll it with a mouse wheel or by dragging.

Thumbnails are shown only for the markers that fit on screen, so zooming in on the map may reduce number of the visible thumbnails. When you select some thumbnail, the corresponding marker opens on the map, and one or several scenes are shown – one line per each scene (identified by the Unix timestamp code with fractional seconds) captured at the same locations.

The scene that matches the selected thumbnail is highlighted (as 4-th line in the Figure 6). Some scenes have different versions of reconstruction from the same source images – they are listed in the same line (like first line in the Figure 6). Links lead to the viewers of the selected scene/version.

Figure 7. Selection of the map / satellite imagery provider

We do not have ground truth models for the captured scenes built with active scanners. Instead, as ranging of distant objects (hundreds of meters) is the most interesting, it is possible to use publicly available satellite imagery and match it to the captured models. We had an ideal view from the Elphel office window – each crack on the pavement was visible in the satellite images so we could match them with the 3d model of the scene. Unfortunately they ruined it recently by replacing asphalt :-).

The scene viewer combines x3dom representation of the 3d scene and the re-sizable overlapping map view. You may switch the map imagery provider by clicking on the map icon as shown in the Figure 7.

The scene and map views are synchronized to each other, there are several ways of navigation in either 3d or map area:

  • drag the 3d view to rotate virtual camera without moving;
  • move cross-hair icon in the map view to rotate camera around vertical axis;
  • toggle button and adjust camera view elevation;
  • use scroll wheel over the 3d area to change camera zoom (field of view is indicated on the map);
  • drag with middle button pressed in the 3d view to move camera perpendicular to the view direction;
  • drag the camera icon (green circle) on the map to move camera horizontally;
  • toggle button and move the camera vertically;
  • press a hotkey t over the 3d area to reset to the initial view: set azimuth and elevation same as captured;
  • press a hotkey r over the 3d area to set view azimuth as captured, elevation equal to zero (horizontal view).

Figure 8. 3D model to map comparison

Comparison of the 3d scene model and the map uses ball markers. By default these markers are one meter in diameter; the size can be changed on the settings page.

Moving pointer over the 3d area with Ctrl key pressed causes the ball to follow the cursor at a distance where the view line intersects the nearest detected surface in the scene. It simultaneously moves the corresponding marker over the map view and indicates the measured distance.

Ctrl-click places the ball marker on the 3d scene and on the map. It is then possible to drag the marker over the map and read the ground truth distance. Dragging the marker over the 3d scene updates location on the map, but not the other way around. In edit mode the mismatch data is used to adjust the captured scene location and orientation.

Program settings used during reconstruction limit the scene far distance to z = 1000 meters, all more distant objects are considered to be located at infinity. X3d allows to use images at infinity using backdrop element, but it is not flexible enough and is not supported by some other programs. In most models we place infinity textures to a large billboard at z = 10,000 meters, and it is where the ball marker will appear if placed on the sky or other far objects.

Figure 9. Settings and link to four images

The settings page shown in Figure 9 has a link to the four-image viewer (Figure 10). These four images correspond to the captured views and are almost “raw images” used for scene reconstruction. These images were subject to the optical aberration correction and are partially rectified – they are rendered as if they were captured by the same camera that has only strictly polynomial radial distortion.

Such images are not actually used in the reconstruction process, they are rendered only for the debug and demonstration purposes. The equivalent data exists in the tile processor only in the frequency domain form as an intermediate result, and was subject to just linear processing (to avoid possible unintended biases) so the images have some residual locally-checkerboard pattern that is due to the Bayer mosaic filter (discussed in the earlier blog). Textures that are generated from the combination of all four images have the contrast of such pattern significantly lower. It is possible to add some non-linear filtering at the very last stage of the texture generation.

Each scene model has a download link for the archive that contains the model itself as *.x3d file and Wavefront *.obj and *.mtl as well as the corresponding RGBA texture files as PNG images. Initially I missed the fact that x3d and obj formats have opposite direction of surface normals for the same triangular faces, so almost half of the Wavefront files still have incorrect (opposite direction) surface normals.

Results

Our initial plan was to test algorithms for the tile processor before implementing them in FPGA. The tile processor provides data for the disparity space image (DSI) – confidence value of having certain disparity for specified 2d position in the image, it also generates texture tiles.

When the tile processor code was written and tested, we still needed some software to visualize the results. DSI itself seemed promising (much better coverage than what I had with earlier experiments with binocular images), but when I tried to convert these textured tiles into viewable x3d model directly, it was a big disappointment. Result did not look like a 3d scene – there were many narrow triangles that made sense only when viewed almost directly from the camera actual location, a small lateral viewpoint movement – and the image was falling apart into something unrecognizable.

Figure 10. Four channel images (click for actual viewer with zoom/pan capability)

I was not able to find ready to use code and the plan to write a quick demo for the tile processor and generated DSI seemed less and less realistic. Eventually it took at least three times longer to get somewhat usable output than to develop DCT-based tile processor code itself.

Current software is still incomplete, lacks many needed features (it even does not cut off background so wires over the sky steal a lot of surrounding space), it runs slow (several minutes per single scene), but it does provide a starting point to evaluate performance of the long range 4-camera multi-view stereo system. Much of the intended functionality does not work without more parameter tuning, but we decided to postpone improvements to the next stage (when we will have cameras that are more stable mechanically) and instead try to capture more of very different scenes, process them in batch mode (keeping the same parameter values for all new scenes) and see what will be the output.

As soon as the program was able to produce a somewhat coherent 3d model from the very first image set captured through the Elphel office window, Oleg Dzhimiev started development of the web application that allows matching the models with the map data. After adding more image sets I noticed that the camera calibration did not hold. Each individual sub-camera performed nicely (they use thermally compensated mechanical design), but their extrinsic parameters did change and we had to add code for field calibration that uses the images themselves. The best accuracy in disparity measurement over the field of view still requires camera poses to match the ones used at full calibration, so later scenes with more developed misalignment (>20 pixels) are less precise than earlier ones (captured in Salt Lake City).

We do not have an established method to measure ranging precision for different distances to object – the disparity values are calculated together with the confidence and in lower confidence areas the accuracy is lower, including places where no ranging is possible due to the complete absence of the visible details in the images. Instead it is possible to compare distances in various scene models to those on the map and see where such camera is useful. With 0.1 pixel disparity resolution and 150 mm baseline we should be able to measure 300 m distances with 10% accuracy, and for many captured scene objects it already is not much worse. We now placed orders to machine the new camera parts that are needed to build a more mechanically stable rig. And parallel to upgrading the hardware, we’ll start migrating the tile processor code from Java to Verilog.
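The 10% estimate can be checked with the pinhole stereo relation, where disparity in pixels is d = f·B / (z·p) and, to first order, the relative range error equals the disparity resolution divided by d. The baseline (150 mm) and disparity resolution (0.1 pix) come from the text. The focal length (4.5 mm) and pixel pitch (2.2 µm) are assumptions not stated in this post and should be replaced with the values for the actual lens and sensor.

```python
def disparity_pixels(distance_m, baseline_m=0.150,
                     focal_length_m=4.5e-3,   # assumed, not from the text
                     pixel_pitch_m=2.2e-6):   # assumed, not from the text
    """Disparity in pixels for an object at the given distance."""
    return focal_length_m * baseline_m / (distance_m * pixel_pitch_m)

def relative_range_error(distance_m, disparity_resolution_pix=0.1, **optics):
    """First-order relative distance error for a given disparity resolution."""
    return disparity_resolution_pix / disparity_pixels(distance_m, **optics)
```

With these assumed optics an object at 300 m produces a disparity of about one pixel, so a 0.1 pixel resolution corresponds to roughly 10% of the distance. The error grows linearly with distance and shrinks in proportion to the baseline.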

And what’s next? Elphel goal is to provide our users with the high performance hackable products and freedom to modify them in the ways and for the purposes we could not imagine ourselves. But it is fun to fantasize about at least some possible applications:

  • Obviously, self-driving cars – increased number of cameras located in a 2d pattern (square) results in significantly more robust matching even with low-contrast textures. It does not depend on sequential scanning and provides simultaneous data over wide field of view. Calculated confidence of distance measurements tells when alternative (active) ranging methods are needed – that would help to avoid infamous accident with a self-driving car that went under a truck.
  • Visual odometry for the drones would also benefit from the higher robustness of image matching.
  • Rovers on Mars or other planets using low-power passive (visual based) scene reconstruction.
  • Maybe self-flying passenger multicopters in the heavy 3d traffic? Sure they will all be equipped with some transponders, but what about aerial roadkills? Like a flock of geese that forced water landing.
  • High speed boating or sailing over uneven seas with active hydrofoils that can look ahead and adjust to the future waves.
  • Landing on the asteroids for physical (not just Bitcoin) mining? With 150 mm baseline such camera can comfortably operate within several hundred meters from the object, with 1.5 m that will scale to kilometers.
  • Cinematography: post-production depth of field control that would easily beat even the widest format optics, HDR with a pair of 4-sensor cameras, some new VFX?
  • Multi-spectral imaging where more spatially separate cameras with different bandpass filters can be combined to the same texture in the 3d scene.
  • Capturing underwater scenes and measuring how far the sea creatures are above the bottom.

by Andrey Filippov at September 21, 2017 05:40 AM

September 12, 2017

Open Hardware Repository

White Rabbit - 12-09-2017: PTP Trackhound smells the White Rabbit

The software PTP Track Hound which can capture and analyze PTP network traffic now understands White Rabbit TLVs. So the Track Hound can now sniff the tracks that the White Rabbit leaves behind.
Track Hound is made freely available by Meinberg. One may want to know that the source code is not available under an Open Licence.

by Erik van der Bij (Erik.van.der.Bij@cern.ch) at September 12, 2017 07:35 AM

September 06, 2017

Free Electrons

Free Electrons at the Embedded Linux Conference Europe

The next Embedded Linux Conference Europe will take place on October 23-25 in Prague, Czech Republic.

Embedded Linux Conference Europe 2017

As usual, a significant part of the Free Electrons engineering team will participate in the conference and give talks on various topics:

In addition to the main ELCE conference, Thomas Petazzoni will participate in the Buildroot Developers Days, a 2-day hackathon organized on Saturday and Sunday prior to ELCE, and will participate in the Device Tree workshop organized on Thursday afternoon.

Once again, we’re really happy to participate in this conference, and looking forward to meeting again with a large number of Linux kernel and embedded Linux developers!

by Thomas Petazzoni at September 06, 2017 11:56 AM

September 05, 2017

Free Electrons

Linux 4.13 released, Free Electrons contributions

Linux 4.13 was released last Sunday by Linus Torvalds, and the major new features of this release were described in detail by LWN in a set of articles: part 1 and part 2.

This release gathers 13006 non-merge commits, amongst which 239 were made by Free Electrons engineers. According to the LWN article on 4.13 statistics, this makes Free Electrons the 13th contributing company by number of commits, the 10th by lines changed.

The most important contributions from Free Electrons for this release have been:

  • In the RTC subsystem
    • Alexandre Belloni introduced a new method for registering RTC devices, with one step for the allocation, and one step for the registration itself, which allows to solve race conditions in a number of drivers.
    • Alexandre Belloni added support for exposing the non-volatile memory found in some RTC devices through the Linux kernel nvmem framework, making them usable from userspace. A few drivers were changed to use this new mechanism.
  • In the MTD/NAND subsystem
    • Boris Brezillon did a large number of fixes and minor improvements in the NAND subsystem, both in the core and in a few drivers.
    • Thomas Petazzoni contributed the support for on-die ECC, specifically with Micron NANDs. This allows to use the ECC calculation capabilities of the NAND chip itself, as opposed to using software ECC (calculated by the CPU) or ECC done by the NAND controller.
    • Thomas Petazzoni contributed a few improvements to the FSMC NAND driver, used on ST Spear platforms. The main improvement is to support the ->setup_data_interface() callback, which allows to configure optimal timings in the NAND controller.
  • Support for Allwinner ARM platforms
    • Alexandre Belloni improved the sun4i PWM driver to use the so-called atomic API and support hardware read out.
    • Antoine Ténart improved the sun4i-ss cryptographic engine driver to support the Allwinner A13 processor, in addition to the already supported A10.
    • Maxime Ripard contributed HDMI support for the Allwinner A10 processor (in the DRM subsystem) and a number of related changes to the Allwinner clock support.
    • Quentin Schulz improved the support for battery charging through the AXP20x PMIC, used on Allwinner platforms.
  • Support for Atmel ARM platforms
    • Alexandre Belloni added suspend/resume support for the Atmel SAMA5D2 clock driver. This is part of a larger effort to implement the backup mode for the SAMA5D2 processor.
    • Alexandre Belloni added suspend/resume support in the tcb_clksrc driver, used for clocksource and clockevents on Atmel SAMA5D2.
    • Alexandre Belloni cleaned up a number of drivers, removing support for non-DT probing, which is possible now that the AVR32 architecture has been dropped. Indeed, the AVR32 processors used to share the same drivers as the Atmel ARM processors.
    • Alexandre Belloni added the core support for the backup mode on Atmel SAMA5D2, a suspend/resume state with significant power savings.
    • Boris Brezillon switched Atmel platforms to use the new binding for the EBI and NAND controllers.
    • Boris Brezillon added support for timing configuration in the Atmel NAND driver.
    • Quentin Schulz added suspend/resume support to the Bosch m_can driver, used on Atmel platforms.
  • Support for Marvell ARM platforms
    • Antoine Ténart contributed a completely new driver (3200+ lines of code) for the Inside Secure EIP197 cryptographic engine, used in the Marvell Armada 7K and 8K processors. He also subsequently contributed a number of fixes and improvements for this driver.
    • Antoine Ténart improved the existing mvmdio driver, used to communicate with Ethernet PHYs over MDIO on Marvell platforms to support the XSMI variant found on Marvell Armada 7K/8K, used to communicate with 10G capable PHYs.
    • Antoine Ténart contributed minimal support for 10G Ethernet in the mvpp2 driver, used on Marvell Armada 7K/8K. For now, the driver still relies on low-level initialization done by the bootloader, but additional changes in 4.14 and 4.15 will remove this limitation.
    • Grégory Clement added a new pinctrl driver to configure the pin-muxing on the Marvell Armada 37xx processors.
    • Grégory Clement did a large number of changes to the clock drivers used on the Marvell Armada 7K/8K processors to prepare the addition of pinctrl support.
    • Grégory Clement added support for Marvell Armada 7K/8K to the existing mvebu-gpio driver.
    • Thomas Petazzoni added support for the ICU, a specialized interrupt controller used on the Marvell Armada 7K/8K, for all devices located in the CP110 part of the processor.
    • Thomas Petazzoni removed a work-around to properly resume per-CPU interrupts on the older Marvell Armada 370/XP platforms.
  • Support for RaspberryPi platforms
    • Boris Brezillon added runtime PM support to the HDMI encoder driver used on RaspberryPi platforms, and contributed a few other fixes to the VC4 DRM driver.

It is worth mentioning that Miquèl Raynal, recently hired by Free Electrons, sees his first kernel patch merged: nand: fix wrong default oob layout for small pages using soft ecc.

Free Electrons engineers are not only contributors, but also maintainers of various subsystems in the Linux kernel, which means they are involved in the process of reviewing, discussing and merging patches contributed to those subsystems:

  • Maxime Ripard, as the Allwinner platform co-maintainer, merged 113 patches from other contributors
  • Boris Brezillon, as the MTD/NAND maintainer, merged 62 patches from other contributors
  • Alexandre Belloni, as the RTC maintainer and Atmel platform co-maintainer, merged 57 patches from other contributors
  • Grégory Clement, as the Marvell EBU co-maintainer, merged 47 patches from other contributors

Here is the commit by commit detail of our contributions to 4.13:

by Thomas Petazzoni at September 05, 2017 07:21 AM

September 02, 2017

Harald Welte

Purism Librem 5 campaign

There's a new project currently undergoing crowd funding that might be of interest to the former Openmoko community: The Purism Librem 5 campaign.

Similar to Openmoko a decade ago, they are aiming to build a FOSS based smartphone built on GNU/Linux without any proprietary drivers/blobs on the application processor, from bootloader to userspace.

Furthermore (just like Openmoko) the baseband processor is fully isolated, with no shared memory and with the Linux-running application processor being in full control.

They go beyond what we wanted to do at Openmoko in offering hardware kill switches for camera/phone/baseband/bluetooth. During Openmoko days we assumed it is sufficient to simply control all those bits from the trusted Linux domain, but of course once that might be compromised, a physical kill switch provides a completely different level of security.

I wish them all the best, and hope they can leave a better track record than Openmoko. Sure, we sold some thousands of phones, but the company quickly died, and the state of software was far from end-user-ready. I think the primary obstacles/complexities are verification of the hardware design as well as the software stack all the way up to the UI.

The budget of ~ 1.5 million seems extremely tight from my point of view, but then I have no information about how much Puri.sm is able to invest from other sources outside of the campaign.

If you're a FOSS developer with a strong interest in a Free/Open privacy-first smartphone, please note that they have several job openings, from Kernel Developer to OS Developer to UI Developer. I'd love to see some talents at work in that area.

It's a bit of a pity that almost all of the actual technical details are unspecified at this point (except RAM/flash/main-cpu). No details on the cellular modem/chipset used, no details on the camera, neither on the bluetooth chipset, wifi chipset, etc. This might be an indication of the early stage of their planning. I would have expected that one has ironed out those questions before looking for funding - but then, it's their campaign and they can run it as they see fit!

I for my part have just put in a pledge for one phone. Let's see what will come of it. In case you feel motivated by this post to join in: Please keep in mind that any crowdfunding campaign bears significant financial risks. So please make sure you made up your mind and don't blame my blog post for luring you into spending money :)

by Harald Welte at September 02, 2017 10:00 PM

September 01, 2017

Harald Welte

The sad state of voice support in cellular modems

Cellular modems have existed for decades and come in many shapes and kinds. They contain the cellular baseband processor, RF frontend, protocol stack software and anything else required to communicate with a cellular network. Basically a phone without display or input.

During the last decade or so, the vast majority of cellular modems come as LGA modules, i.e. a small PCB with all components on the top side (and a shielding can), which has contact pads on the bottom so you can solder it onto your mainboard. You can obtain them from vendors such as Sierra Wireless, u-blox, Quectel, ZTE, Huawei, Telit, Gemalto, and many others.

In most cases, the vendors now also solder those modules to small adapter boards to offer the same product in mPCIe form-factor. Other modems are directly manufactured in mPCIe or NGFF aka m.2 form-factor.

As long as those modems were still 2G / 2.5G / 2.75G, the main interconnection with the host (often some embedded system) was a serial UART. The audio input/output for voice calls was made available as analog signals, ready to connect a microphone and speaker, as that's what the cellular chipsets were designed for in the smartphones. In the Openmoko phones we also interfaced the audio of the cellular modem in analog, exactly for that reason.

From 3G onwards, the primary interface towards the host is now USB, with the modem running as a USB device. If your laptop contains a cellular modem, you will see it show up in the lsusb output.

From that point onwards, it would have made a lot of sense to simply expose the audio also via USB. Simply offer a multi-function USB device that has both whatever virtual serial ports for AT commands and network device for IP, and add a USB Audio device to it. It would simply show up as a "USB sound card" to the host, with all standard drivers working as expected. Sadly, nobody seems to have implemented this, at least not in a supported production version of their product.

Instead, what some modem vendors have implemented as an ugly hack is the transport of 8kHz 16bit PCM samples over one of the UARTs. See for example the Quectel UC-20 or the Simcom SIM7100 which implement such a method.

All the others largely ignore access to the audio stream from software. One wonders why that is. From a software and systems architecture perspective it would be super easy. Instead, what most vendors do is expose a digital PCM interface. This is suboptimal in many ways:

  • there is no mPCIe standard on which pins PCM should be exposed
  • no standard product (like laptop, router, ...) with mPCIe slot will have anything connected to those PCM pins

Furthermore, each manufacturer / modem seems to support a different subset or dialect of the PCM interface in terms of

  • voltage (almost all of them are 1.8V, while mPCIe signals normally are 3.3V logic level)
  • master/slave (almost all of them insist on being a clock master)
  • sample format (alaw/ulaw/linear)
  • clock/bit rate (mostly 2.048 MHz, but can be as low as 128kHz)
  • frame sync (mostly short frame sync that ends before the first bit of the sample)
  • endianness (mostly MSB first)
  • clock phase (mostly change signals at rising edge; sample at falling edge)
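
To make the scale of the problem concrete, the parameters listed above can be captured in a small record and compared between two modems. The following is only a sketch: the class and field names are made up for illustration, and the two example dialects are hypothetical rather than taken from any specific modem data sheet. The 8 kHz sample rate is an assumption based on the narrowband voice case mentioned elsewhere in this post.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class PcmDialect:
    """One modem's flavour of the PCM interface (illustrative field names)."""
    voltage: float                # logic level in volts
    modem_is_clock_master: bool   # True if the modem insists on driving the clock
    sample_format: str            # "alaw", "ulaw" or "linear"
    bit_clock_hz: int
    short_frame_sync: bool
    msb_first: bool
    change_on_rising_edge: bool   # signals change at rising, are sampled at falling edge
    sample_rate_hz: int = 8000    # assumed narrowband voice sample rate

    def bits_per_frame(self) -> int:
        # Number of bit-clock cycles between two consecutive frame syncs.
        return self.bit_clock_hz // self.sample_rate_hz


def mismatches(a: PcmDialect, b: PcmDialect) -> list:
    """Names of all parameters in which two dialects differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


# Hypothetical "most common" dialect according to the list above.
COMMON = PcmDialect(1.8, True, "linear", 2_048_000, True, True, True)
# Hypothetical second modem that only differs in its bit clock.
SLOW = replace(COMMON, bit_clock_hz=128_000)

if __name__ == "__main__":
    print(COMMON.bits_per_frame(), SLOW.bits_per_frame(), mismatches(COMMON, SLOW))
```

A mainboard designed for one dialect works with another modem only if `mismatches()` comes back empty, which is exactly the constraint that a USB Audio device would have avoided.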

It's a real nightmare, when it could be so simple. If they implemented USB-Audio, you could plug a cellular modem into any board with a mPCIe slot and it would simply work. As they don't, you need a specially designed mainboard that implements exactly the specific dialect/version of PCM of the given modem.

By the way, the most "amazing" vendor seems to be u-blox. Their modems support PCM audio, but only the solder-type version. They simply didn't route those signals to the mPCIe slot, making audio impossible to use when using a connectorized modem. How inconvenient.

Summary

If you want to access the audio signals of a cellular modem from software, then you either

  • have standard hardware and pick one very specific modem model and hope it remains available for the lifetime of your application, or
  • build your own hardware implementing a PCM slave interface and then pick + choose your cellular modem

On the Osmocom mpcie-breakout board and the sysmocom QMOD board we have exposed the PCM related pins on 2.54mm headers to allow for some separate board to pick up that PCM and offer it to the host system. However, such separate board hasn't been developed so far.

by Harald Welte at September 01, 2017 10:00 PM

First actual XMOS / XCORE project

For many years I've been fascinated by the XMOS XCore architecture. It offers a surprisingly refreshing alternative to virtually any other classic microcontroller architecture out there. However, despite reading a lot about it years ago, being fascinated by it, and even giving a short informal presentation about it once, I've so far never used it. Too much "real" work imposes a high barrier to spending time learning about new architectures, languages, toolchains and the like.

Introduction into XCore

Rather than having lots of fixed-purpose built-in "hard core" peripherals for interfaces such as SPI, I2C, I2S, etc. the XCore controllers have a combination of

  • I/O ports for 1/4/8/16/32 bit wide signals, with SERDES, FIFO, hardware strobe generation, etc
  • Clock blocks for using/dividing internal or external clocks
  • hardware multi-threading that presents 8 logical threads on each core
  • xCONNECT links that can be used to connect multiple processors over 2 or 5 wires per direction
  • channels as a means of communication (similar to sockets) between threads, whether on the same xCORE or a remote core via xCONNECT
  • an extended C (xC) programming language to make use of parallelism, channels and the I/O ports

In spirit, it is like a 21st century implementation of some of the concepts established first with Transputers.

My main interest in xMOS has been the flexibility that you get in implementing not-so-standard electronics interfaces. For regular I2C, UART, SPI, etc. there is of course no such need. But every so often one encounters some interface that's very rarely found (like the output of an E1/T1 Line Interface Unit).

Also, quite often I run into use cases where it's simply impossible to find a microcontroller with a sufficient number of the related peripherals built-in. Try finding a microcontroller with 8 UARTs, for example. Or one with four different PCM/I2S interfaces, which all can run in different clock domains.

The existing options of solving such problems basically boil down to either implementing it in hard-wired logic (unrealistic, complex, expensive) or going to programmable logic with CPLD or FPGAs. While the latter is certainly also quite interesting, the learning curve is steep, the tools anything but easy to use and the synthesising time (and thus development cycles) long. Furthermore, your board design will be more complex as you have that FPGA/CPLD and a microcontroller, need to interface the two, etc (yes, in high-end use cases there's the Zynq, but I'm thinking of several orders of magnitude less complex designs).

Of course one can also take a "pure software" approach and go for high-speed bit-banging. There are some ARM SoCs that can toggle their pins. People have reported rates like 14 MHz being possible on a Raspberry Pi. However, when running a general-purpose OS in parallel, this kind of speed is hard to do reliably over long term, and the related software implementations are going to be anything but nice to write.

So the XCore is looking like a nice alternative for a lot of those use cases, where you want a microcontroller with more programmability in terms of its I/O capabilities, but don't want to go full-on with FPGA/CPLD development in Verilog or VHDL.

My current use case

My current use case is to implement a board that can accept four independent PCM inputs (all in slave mode, i.e. clock provided by external master) and present them via USB to a host PC. The final goal is to have a board that can be combined with the sysmoQMOD and which can interface the PCM audio of four cellular modems concurrently.

While XMOS is quite strong in the Audio field and you can find existing examples and app notes for I2S and S/PDIF, I couldn't find any existing code for a PCM slave of the given requirements (short frame sync, 8kHz sample rate, 16bit samples, 2.048 MHz bit clock, MSB first).

I wanted to get a feeling how well one can implement the related PCM slave. In order to test the slave, I decided to develop the matching PCM master and run the two against each other. Despite having never written any code for XMOS before, nor having used any of the toolchain, I was able to implement the PCM master and PCM slave within something like ~6 hours, including simulation and verification. Sure, one can certainly do that in much less time, but only once you're familiar with the tools, programming environment, language, etc. I think it's not bad.

The biggest problem was that the clock phase for a clocked output port cannot be configured, i.e. the XCore insists on always clocking out a new bit at the falling edge, while my use case of course required the opposite: clocking out new signals at the rising edge. I had to use a second clock block to generate the inverted clock in order to achieve that goal.
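
The framing described above (short frame sync, 16-bit samples, MSB first, outputs changing at the rising edge and sampled at the falling edge) can be modelled in a few lines. What follows is a plain Python simulation written purely for illustration; it is not the xC code discussed in this post, and the function names are invented. It assumes the frame sync pulse occupies exactly one bit-clock period directly before the first sample bit, and it treats samples as unsigned 16-bit values.

```python
def pcm_master(samples, bits_per_frame=256, width=16):
    """Yield (clk, fsync, data) once per clock edge, for each sample in turn.

    256 bit slots per frame corresponds to a 2.048 MHz bit clock at 8 kHz.
    """
    for sample in samples:
        for slot in range(bits_per_frame):
            # Slot 0 carries the short frame sync; it ends before the first data bit.
            fsync = 1 if slot == 0 else 0
            if 1 <= slot <= width:
                data = (sample >> (width - slot)) & 1   # MSB first
            else:
                data = 0                                # idle for the rest of the frame
            yield (1, fsync, data)   # rising edge: master updates its outputs
            yield (0, fsync, data)   # falling edge: outputs are held stable


def pcm_slave(signal, width=16):
    """Recover samples by looking at fsync and data on falling clock edges only."""
    samples = []
    prev_clk = 0
    bits_left = 0
    value = 0
    for clk, fsync, data in signal:
        falling = prev_clk == 1 and clk == 0
        prev_clk = clk
        if not falling:
            continue
        if bits_left:
            value = (value << 1) | data
            bits_left -= 1
            if bits_left == 0:
                samples.append(value)
        elif fsync:
            # Frame sync seen: the next `width` falling edges carry the sample.
            bits_left, value = width, 0
    return samples


if __name__ == "__main__":
    sent = [0x1234, 0xFFFF, 0x0000, 0x8001]
    print(pcm_slave(pcm_master(sent)) == sent)
```

Running master and slave against each other like this mirrors the test approach described above, only without any real timing: the simulation has no notion of setup and hold times, which is where the clock phase issue actually matters on hardware.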

Beyond that 4xPCM use case, I also have other ideas like finally putting the osmo-e1-xcvr to use by combining it with an XMOS device to build a portable E1-to-USB adapter. I have no clue if and when I'll find time for that, but if somebody wants to join in: Let me know!

The good parts

Documentation excellent

I found the various pieces of documentation extremely useful and very well written.

Fast progress

I was able to make fast progress in solving the first task using the XMOS / Xcore approach.

Soft Cores developed in public, with commit log

You can find plenty of soft cores that XMOS has been developing on github at https://github.com/xcore, including the full commit history.

This type of development is a big improvement over what most vendors of smaller microcontrollers like Atmel are doing (infrequent tar-ball code-drops without commit history). And in the case of the classic uC vendors, we're talking about drivers only. In the XMOS case it's about the entire logic of the peripheral!

You can for example see that for their I2C core, the very active commit history goes back to January 2011.

xSIM simulation extremely helpful

The xTIMEcomposer IDE (based on Eclipse) contains extensive tracing support and an extensible near cycle accurate simulator (xSIM). I've implemented a PCM master and PCM slave in xC and was able to simulate the program while looking at the waveforms of the logic signals between those two.

The bad parts

Unfortunately, my extremely enthusiastic reception of XMOS has suffered quite a bit over time. Let me explain why:

Hard to get XCore chips

While the product portfolio on the xMOS website looks extremely comprehensive, the vast majority of the parts are not available from stock at distributors. You won't even get samples, and lead times are 12 weeks (!). If you check at digikey, they have listed a total of 302 different XMOS controllers, but only 35 of them are in stock, and just 15 are USB capable. With other distributors like Farnell it's even worse.

I've seen this with other semiconductor vendors before, but never to such a large extent. Sure, some packages/configurations are not standard products, but having only 11% of the portfolio actually available is pretty bad.

In such situations, where it's difficult to convince distributors to stock parts, it would be a good idea for XMOS to stock parts themselves and provide samples / low quantities directly. Not everyone is able to order large trays and/or capable to wait 12 weeks, especially during the R&D phase of a board.

Extremely limited number of single-bit ports

In the smaller / lower pin-count parts, like the XU[F]-208 series in QFN/LQFP-64, the number of usable, exposed single-bit ports is ridiculously low. Out of the total 33 I/O lines available, only 7 can be used as single-bit I/O ports. All other lines can only be used for 4-, 8-, or 16-bit ports. If you're dealing primarily with serial interfaces like I2C, SPI, I2S, UART/USART and the like, those parallel ports are of no use, and you have to go for a mechanically much larger part (like XU[F]-216 in TQFP-128) in order to have a decent number of single-bit ports exposed. Those parts also come with twice the number of cores, memory, etc., which you don't need for slow-speed serial interfaces...

Change to a non-FOSS License

XMOS deserved a lot of praise for releasing all their soft IP cores as Free / Open Source Software on github at https://github.com/xcore. The License has basically been a 3-clause BSD license. This was a good move, as it meant that anyone could create derivative versions, whether proprietary or FOSS, and there would be virtually no license incompatibilities with whatever code people wanted to write.

However, to my very big disappointment, more recently XMOS seems to have changed their policy on this. New soft cores (released at https://github.com/xmos as opposed to the old https://github.com/xcore) are made available under a non-free license. This license is nothing like the BSD 3-clause license or any other Free Software or Open Source license. It restricts the license to use the code together with an XMOS product, requires the user to contribute fixes back to XMOS and contains references to import and export control. This license is incompatible with probably any FOSS license in existence, making it impossible to write FOSS code on XMOS while using any of the new soft cores released by XMOS.

But even beyond that license change, not even all code is provided in source code format anymore. The new USB library (lib_usb) is provided as binary-only library, for example.

If you know anyone at XMOS management or XMOS legal with whom I could raise this topic of license change when transitioning from older sc_* software to later lib_* code, I would appreciate this a lot.

Proprietary Compiler

While a lot of the toolchain and IDE is based on open source (Eclipse, LLVM, ...), the actual xC compiler is proprietary.

by Harald Welte at September 01, 2017 10:00 PM

Open Hardware Repository

MasterFIP - Order of 130 masterFIP v4

After having validated the design (through 15 v3) we are now ready to produce 130 v4 masterFIP boards!
The plan is to install them in machines in LHC operation before the end of this year.

by Evangelia Gousiou (Evangelia.Gousiou@cern.ch) at September 01, 2017 04:23 PM

August 30, 2017

Open Hardware Repository

White Rabbit - 29-08-2017: Geodetic station connected with WR to UTC(MIKE)

MIKES, the centre for metrology and accreditation of Finland, has connected the Metsähovi Geodetic Research Station to the official time of Finland, UTC[MIKE]

Some quotes from the article Metsähovi connected to the official time of Finland:

The time transfer to Metsähovi, Kirkkonummi, occurs from the UTC-laboratory at VTT MIKES Metrology in Otaniemi via optical fibre using the White Rabbit protocol. VTT MIKES Metrology has been an early adopter of the White Rabbit technology for time transfer across long distances. White Rabbit was developed at CERN, the European Organization for Nuclear Research.

The measurements show, for example, how the travel time of light each way in a 50-kilometre fibre optic cable varies by approx. 7 nanoseconds within a 24-hour period as temperature changes affect the properties of the fibre optic cable, particularly its length.

The uncertainty of time transfer is expected to be 100 ps or better. The precision of frequency transfer is currently approx. 15 digits.
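
To put the quoted 7 nanoseconds into perspective, one can estimate the one-way propagation delay of a 50 km fibre and compare the daily variation against it. This is only a back-of-the-envelope sketch: the group index of 1.468 is an assumed, typical value for single-mode fibre and does not come from the article.

```python
C_VACUUM_M_PER_S = 299_792_458   # exact, by definition of the metre
GROUP_INDEX = 1.468              # assumption: typical single-mode fibre value


def one_way_delay_s(length_m, group_index=GROUP_INDEX):
    """Propagation delay of light through a fibre of the given length."""
    return length_m * group_index / C_VACUUM_M_PER_S


delay = one_way_delay_s(50_000)
relative_variation = 7e-9 / delay

if __name__ == "__main__":
    print(f"one-way delay: {delay * 1e6:.1f} us")
    print(f"daily variation relative to the delay: {relative_variation:.1e}")
```

Under that assumption the delay comes out at roughly a quarter of a millisecond, so a 7 ns swing is a change of a few parts in 100,000, far larger than the 100 ps target and therefore something the time transfer has to compensate for continuously.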

by Erik van der Bij (Erik.van.der.Bij@cern.ch) at August 30, 2017 09:41 AM

August 29, 2017

Free Electrons

Free Electrons at the Linux Plumbers 2017 conference

The Linux Plumbers conference has established itself as a major conference in the Linux ecosystem, discussing numerous aspects of the low-level layers of the Linux software stack. Linux Plumbers is organized around a number of micro-conferences, plus a number of more regular talks.

Linux Plumbers 2017

Free Electrons already participated in several previous editions of Linux Plumbers, and will again participate in this year’s edition that takes place in Los Angeles on September 13-15. Free Electrons engineers Boris Brezillon, Alexandre Belloni, Grégory Clement and Thomas Petazzoni will attend the conference.

If you’re attending this conference, or are located in the Los Angeles area, and want to meet us, do not hesitate to drop us a line at info@free-electrons.com. You can also follow Free Electrons Twitter feed for updates during the conference.

by Thomas Petazzoni at August 29, 2017 11:43 AM

August 25, 2017

Open Hardware Repository

White Rabbit Switch - Software - WR Switch firmware v5.0.1 released

Since v5.0 was released we have found a few problems in the WR Switch software package. This new v5.0.1 release does not include new functionality but contains important hotfixes to v5.0. The FPGA bitstream used in v5.0.1 is exactly the same as in v5.0, therefore the same calibration values apply. As for any other release, you can find all the links to download the firmware binaries and manuals on our v5.0.1 release wiki page.

Main fixes include:
  • USB flashing which was broken in v5.0
  • PPSI pre-master state fix
  • make menuconfig fixes
  • SNMP fixes
  • Webinterface fixes
For the full list of solved issues please check:

We advise updating your v5.0 switches to include these latest fixes.

Greg Daniluk, Adam Wujek

by Grzegorz Daniluk (grzegorz.daniluk@cern.ch) at August 25, 2017 11:37 AM

August 19, 2017

Harald Welte

Osmocom jenkins test suite execution

Automatic Testing in Osmocom

So far, in many Osmocom projects we have unit tests next to the code. Those unit tests execute tests on a per-C-function basis, and typically use the respective function directly from a small test program, executed at make check time. The actual main program (like OsmoBSC or OsmoBTS) is not executed at that time.

We also have VTY testing, which specifically tests that the VTY has proper documentation for all nodes of all commands.

Then there's a big gap, and we have osmo-gsm-tester for testing a full cellular network end-to-end. It includes physical GSM modems, coaxial distribution network, attenuators, splitter/combiners, real BTS hardware and logic to run the full network, from OsmoBTS to the core - both for OsmoNITB and OsmoMSC+OsmoHLR based networks.

However, I think a lot of testing falls somewhere in between, where you want to run the program-under-test (e.g. OsmoBSC), but you don't want to run the MS, BTS and MSC that normally surround it. You want to test it by emulating the BTS on the Abis side and the MSC on the A side, and just test Abis and A interface transactions.

For this kind of testing, I have recently started to investigate available options and tools.

OsmoSTP (M3UA/SUA)

Several months ago, during the development of OsmoSTP, I discovered that the Network Programming Lab of Münster University of Applied Sciences led by Michael Tuexen had released implementations of the ETSI test suite for the M3UA and SUA members of the SIGTRAN protocol family.

The somewhat difficult part is that they are implemented in scheme, using the guile interpreter/compiler, as well as a C-language based execution wrapper, which then is again called by another guile wrapper script.

I've reimplemented the test executor in python and added JUnitXML output to it. This means it can feed the test results directly into Jenkins.
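
For readers unfamiliar with the format, the JUnit XML that Jenkins consumes is simple enough to generate with the Python standard library. The sketch below is not the actual test executor mentioned above; the function name, the result tuple layout and the test case names in the example are all hypothetical, and it only emits the small subset of attributes that Jenkins commonly displays.

```python
import xml.etree.ElementTree as ET


def results_to_junit_xml(suite_name, results):
    """Render test results as a JUnit-style XML string.

    `results` is a list of (test_name, passed, message) tuples.
    """
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)), failures=str(failures))
    for name, passed, message in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name, name=name)
        if not passed:
            # A <failure> child marks the test case as failed.
            failure = ET.SubElement(case, "failure", message=message)
            failure.text = message
    return ET.tostring(suite, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical test case names, for illustration only.
    example = [("example-test-1", True, ""),
               ("example-test-2", False, "unexpected response")]
    print(results_to_junit_xml("m3ua-test", example))
```

Writing the returned string to a file in the job workspace and pointing the Jenkins JUnit publisher at it is all that is needed for per-test pass/fail reporting.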

I've also cleaned up the Dockerfiles and related image generation for the osmo-stp-master, m3ua-test and sua-test images, as well as some scripts to actually execute them on one of the Builders. You can find related Dockerfiles as well as associated Makefiles in http://git.osmocom.org/docker-playground

The end result after integration with Osmocom jenkins can be seen in the following examples on jenkins.osmocom.org for M3UA and for SUA

Triggering the builds is currently periodic once per night, but we could of course also trigger them automatically at some later point.

OpenGGSN (GTP)

For OpenGGSN, during the development of IPv6 PDP context support, I wrote some test infrastructure and test cases in TTCN-3. Those test cases can be found at http://git.osmocom.org/osmo-ttcn3-hacks/tree/ggsn_tests

I've also packaged the GGSN and the test cases each into separate Docker containers called osmo-ggsn-latest and ggsn-test. Related Dockerfiles and Makefiles can again be found in http://git.osmocom.org/docker-playground - together with an Eclipse TITAN Docker base image using Debian Stretch called debian-stretch-titan

Using those TTCN-3 test cases with the TITAN JUnitXML logger plugin we can again integrate the results directly into Jenkins, whose results you can see at https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-ggsn-test/14/testReport/(root)/GGSN_Tests/

Further Work

I've built some infrastructure for Gb (NS/BSSGP), VirtualUm and other testing, but have yet to build Docker images and related jenkins integration for it. Stay tuned about that. Also, lots more actual test cases are required. I'm very much looking forward to any contributions.

by Harald Welte at August 19, 2017 10:00 PM

August 18, 2017

Open Hardware Repository

1:8 Pulse/Frequency Distribution Amplifier - S/N005 phase-noise measurement at 10 MHz

Phase-noise measurements show a flat spur-free phase-noise and AM-noise floor of -162 dBc/Hz at >100 Hz offset from a 10 MHz carrier. Measurements at 5 MHz to follow.
See wiki for results.

by Anders Wallin (anders.wallin@mikes.fi) at August 18, 2017 06:38 AM

August 16, 2017

Free Electrons

Updated bleeding edge toolchains on toolchains.free-electrons.com

Two months ago, we announced a new service from Free Electrons: free and ready-to-use Linux cross-compilation toolchains, for a large number of architectures and C libraries, available at http://toolchains.free-electrons.com/.

Bleeding edge toolchain updates

All our bleeding edge toolchains have been updated, with the latest version of the toolchain components:

  • gcc 7.2.0, which was released 2 days ago
  • glibc 2.26, which was released 2 weeks ago
  • binutils 2.29
  • gdb 8.0

Those bleeding edge toolchains are now based on Buildroot 2017.08-rc2, which brings a nice improvement: the host tools (gcc, binutils, etc.) are no longer linked statically against gmp, mpfr and other host libraries. They are dynamically linked against them with an appropriate rpath encoded into the gcc and binutils binaries to find those shared libraries regardless of the installation location of the toolchain.

However, due to gdb 8.0 requiring a C++11 compiler on the host machine (at least gcc 4.8), our bleeding edge toolchains are now built in a Debian Jessie system instead of Debian Squeeze, which means that at least glibc 2.14 is needed on the host system to use them.
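
If you want to check whether a build host satisfies this requirement, Python's standard `platform.libc_ver()` reports the C library of the running interpreter. The following is a small sketch with made-up helper names; note that versions must be compared as number tuples, since a plain string comparison would wrongly rank "2.9" above "2.14".

```python
import platform

MIN_GLIBC = (2, 14)   # minimum host glibc stated above


def parse_version(text):
    """Turn a version string such as "2.26" into a tuple such as (2, 26)."""
    return tuple(int(part) for part in text.split(".")[:2])


def host_glibc_ok(libc_name, libc_version, minimum=MIN_GLIBC):
    """Return True if the reported C library is glibc and is new enough."""
    if libc_name != "glibc" or not libc_version:
        return False
    return parse_version(libc_version) >= minimum


if __name__ == "__main__":
    name, version = platform.libc_ver()
    print(name, version, host_glibc_ok(name, version))
```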

The only toolchains for which the tests are not successful are the MIPS64R6 toolchains, due to the Linux kernel not building properly for this architecture with gcc 7.x. This issue has already been reported upstream.

Stable toolchain updates

We haven’t changed the component versions of our stable toolchains, but we made a number of fixes to them:

  • The armv7m and m68k-coldfire toolchains have been rebuilt with a fixed version of elf2flt that makes the toolchain linker directly usable. This fixes building the Linux kernel using those toolchains.
  • The mips32r5 toolchain has been rebuilt with NaN 2008 encoding (instead of NaN legacy), which makes the resulting userspace binaries actually executable by the Linux kernel, which expects NaN 2008 encoding on mips32r5 by default.
  • Most mips toolchains for musl have been rebuilt, with Buildroot fixes for the creation of the dynamic linker symbolic link. This has no effect on the toolchain itself, but allows the tests under Qemu to work properly and validate the toolchains.

Other improvements

We made a number of small improvements to the toolchains.free-electrons.com site:

  • Each architecture now has a page that lists all toolchain versions available. This makes it easy to find a toolchain that matches your requirements (in terms of gcc version, kernel headers version, etc.). See All aarch64 toolchains for an example.
  • We added a FAQ as well as a news page.

As usual, we welcome feedback about our toolchains, either on our bug tracker or by mail at info@free-electrons.com.

by Thomas Petazzoni at August 16, 2017 08:20 PM

Open Hardware Repository

1:8 Pulse/Frequency Distribution Amplifier - S/N005 assembled and tested

Amplifier S/N 005 was assembled and tested, housing PDA2017.07 and FDA2017.07 boards.
Initial phase-noise tests show good performance, similar to the previous generation of the board.
The new PDA2017.07 design using IDT5PB1108 has very fast rise-time (to be measured), possibly a quite low output-impedance (to be fixed?) and a preliminary channel-to-channel output skew of max 250 ps.

by Anders Wallin (anders.wallin@mikes.fi) at August 16, 2017 01:29 PM

Free Electrons

Free Electrons proposes an I3C subsystem for the Linux kernel

MIPI I3C fact sheet, from the MIPI I3C white paper

At the end of 2016, the MIPI consortium finalized the first version of its I3C specification, a new communication bus that aims at replacing older busses like I2C or SPI. According to the specification, I3C gets closer to SPI data rate while requiring fewer pins and adding interesting mechanisms like in-band interrupts, hotplug capability or automatic discovery of devices connected on the bus. In addition, I3C provides backward compatibility with I2C: I3C and legacy I2C devices can be connected on a common bus controlled by an I3C master.

For more details about I3C, we suggest reading the MIPI I3C Whitepaper, as unfortunately MIPI has not publicly released the specifications for this protocol.

For the last few months, Free Electrons engineer Boris Brezillon has been working with Cadence to develop a Linux kernel subsystem to support this new bus, as well as Cadence’s I3C master controller IP. We have now posted the first version of our patch series to the Linux kernel mailing list for review, and we already received a large number of very useful comments from the kernel community.

Free Electrons is proud to be pioneering the support for this new bus in the Linux kernel, and hopes to see other developers contribute to this subsystem in the near future!

by Boris Brezillon at August 16, 2017 01:10 PM

August 14, 2017

Bunnie Studios

Name that Ware, August 2017

The Ware for August 2017 is below.

I removed a bit of context to make it more difficult — if it proves unguessable I’ll zoom out slightly (or perhaps just leave one extra, crucial hint to consider).

by bunnie at August 14, 2017 04:23 PM

Winner, Name that Ware July 2017

The ware for July 2017 is a PMT (photomultiplier tube) module. I’d say wrm gets the prize this month, for getting that it’s a PMT driver first, and for linking to a schematic. :) That’s an easy way to win me over. Gratz, email me to claim your prize!

by bunnie at August 14, 2017 04:22 PM

August 08, 2017

Harald Welte

IPv6 User Plane support in Osmocom

Preface

Cellular systems ever since GPRS are using a tunnel based architecture to provide IP connectivity to cellular terminals such as phones, modems, M2M/IoT devices and the like. The MS/UE establishes a PDP context between itself and the GGSN on the other end of the cellular network. The GGSN then is the first IP-level router, and the entire cellular network is abstracted away from the User-IP point of view.

This architecture didn't change with EGPRS, and not with UMTS, HSxPA and even survived conceptually in LTE/4G.

While the concept of a PDP context / tunnel exists to de-couple the transport layer from the structure and type of data inside the tunneled data, the primary user plane so far has been IPv4.

In Osmocom, we made sure that there are no impairments / assumptions about the contents of the tunnel, so OsmoPCU and OsmoSGSN do not care at all what bits and bytes are transmitted in the tunnel.

The only Osmocom component dealing with the type of tunnel and its payload structure is OpenGGSN. The GGSN must allocate the address/prefix assigned to each individual MS/UE, perform routing between the external IP network and the cellular network and hence is at the heart of this. Sadly, OpenGGSN was an abandoned project for many years until Osmocom adopted it, and it only implemented IPv4.

This is actually a big surprise to me. Many of the users of the Osmocom stack are from the IT security area. They use the Osmocom stack to test mobile phones for vulnerabilities, analyze mobile malware and the like. As any penetration tester should be interested in analyzing all of the attack surface exposed by a given device-under-test, I would have assumed that testing just on IPv4 would be insufficient and over the past 9 years, somebody should have come around and implemented the missing bits for IPv6 so they can test on IPv6, too.

In reality, it seems nobody has shared that line of thinking and invested a bit of time in growing the tools used. Or if they did, they didn't share the related code.

In June 2017, Gerrie Roos submitted a patch for OpenGGSN IPv6 support that raised hopes about soon being able to close that gap. However, on closer inspection it turns out that the code was written against a more than 7 years old version of OpenGGSN, and it seems to primarily focus on IPv6 on the outer (transport) layer, rather than on the inner (user) layer.

OpenGGSN IPv6 PDP Context Support

So in July 2017, I started to work on IPv6 PDP support in OpenGGSN.

Initially I thought: How hard can it be? It's not like IPv6 is new to me (I joined 6bone under 3ffe prefixes back in the 1990s and worked on IPv6 support in ip6tables ages ago). And aside from allocating/matching longer addresses, what kind of complexity does one expect?

After my initial attempt at an implementation, partially misled by the patch that was contributed against that 2010-or-older version of OpenGGSN, I'm surprised how wrong I was.

In IPv4 PDP contexts, the process of establishing a PDP context is simple:

  • Request establishment of a PDP context, set the type to IETF IPv4
  • Receive an allocated IPv4 End User Address
  • Optionally use IPCP (part of PPP) to request and receive DNS Server IP addresses

So I implemented the identical approach for IPv6. Maintain a pool of IPv6 addresses, allocate one, and use IPCP for DNS. And nothing worked. It turns out that IPv6 PDP contexts work quite differently:

  • IPv6 PDP contexts assign a /64 prefix, not a single address or a smaller prefix
  • The End User Address that's part of the Signalling plane of Layer 3 Session Management and GTP is not the actual address, but just serves to generate the interface identifier portion of a link-local IPv6 address
  • IPv6 stateless autoconfiguration is used with this link-local IPv6 address inside the User Plane, after the control plane signaling to establish the PDP context has completed. This means the GGSN needs to parse ICMPv6 router solicitations and generate ICMPv6 router advertisements.
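
The relationship between the signalled End User Address, the link-local address and the /64 prefix can be illustrated with Python's `ipaddress` module. This is a sketch with invented function names and documentation-range example addresses, not code from OpenGGSN; in particular, the interface identifier handed to `address_in_prefix()` is just a parameter here, as the mobile picks its own when it autoconfigures.

```python
import ipaddress

IID_MASK = (1 << 64) - 1   # lower 64 bits of an IPv6 address


def interface_id(eua):
    """Interface identifier portion (lower 64 bits) of the End User Address."""
    return int(ipaddress.IPv6Address(eua)) & IID_MASK


def link_local_from_eua(eua):
    """Link-local address built from fe80::/64 plus the signalled interface ID."""
    base = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(base | interface_id(eua))


def address_in_prefix(prefix, iid):
    """Combine the /64 prefix of the PDP context with an interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("IPv6 PDP contexts are assigned a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | (iid & IID_MASK))


if __name__ == "__main__":
    eua = "2001:db8:1:2::5"                       # example value only
    print(link_local_from_eua(eua))               # used for router solicitation
    print(address_in_prefix("2001:db8:1:2::/64", 0xABCD))
```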

To make things worse, the stateless autoconfiguration is modified in some subtle ways to make it different from the normal SLAAC used on Ethernet and other media:

  • the timers / lifetimes are different
  • only one prefix is permitted
  • only a prefix length of 64 is permitted
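
As an illustration of what the GGSN has to generate, the sketch below packs a router advertisement with exactly one prefix information option, following the message layout of RFC 4861, and refuses anything other than a /64. The function name is invented, and the hop limit, flag and lifetime defaults are placeholders rather than values taken from the 3GPP specifications. The ICMPv6 checksum is left at zero, so a real implementation would still need to compute it over the IPv6 pseudo-header.

```python
import ipaddress
import struct

ICMPV6_ROUTER_ADVERT = 134   # ICMPv6 type, RFC 4861
ND_OPT_PREFIX_INFO = 3       # neighbour discovery option type, RFC 4861
PREFIX_FLAG_ONLINK = 0x80    # L flag
PREFIX_FLAG_AUTO = 0x40      # A flag (autonomous address configuration)


def build_router_advert(prefix, router_lifetime=1800,
                        valid_lifetime=0xFFFFFFFF,
                        preferred_lifetime=0xFFFFFFFF,
                        prefix_flags=PREFIX_FLAG_AUTO):
    """Build an RA (checksum left as 0) carrying a single /64 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("only a prefix length of 64 is permitted")
    header = struct.pack("!BBHBBHII",
                         ICMPV6_ROUTER_ADVERT, 0, 0,   # type, code, checksum placeholder
                         64, 0,                        # cur hop limit, M/O flags
                         router_lifetime, 0, 0)        # lifetime, reachable, retrans timer
    option = struct.pack("!BBBBIII16s",
                         ND_OPT_PREFIX_INFO, 4,        # option type, length in 8-byte units
                         net.prefixlen, prefix_flags,
                         valid_lifetime, preferred_lifetime,
                         0,                            # reserved
                         net.network_address.packed)
    return header + option


if __name__ == "__main__":
    print(build_router_advert("2001:db8:1:2::/64").hex())
```

Since the function accepts a single prefix only, the "one prefix" restriction is enforced by construction rather than by a runtime check.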

A few days later I implemented all of that, but it still didn't work. The problem was with DNS server adresses. In IPv4, the 3GPP protocols simply tunnel IPCP frames for this. This makes a lot of sense, as IPCP is designed for point-to-point interfaces, and this is exactly what a PDP context is.

In IPv6, the corresponding IP6CP protocol does not have the capability to provision DNS server addresses to a PPP client. WTF? The IETF seriously requires implementations to do DHCPv6 over PPP, after establishing a point-to-point connection, only to get DNS server information?!? Some people suggested an IETF draft to change this, but the draft expired in 2011 and we're still stuck.

While 3GPP permits the use of DHCPv6 in some scenarios, support in phones/modems for it is not mandatory. Rather, the 3GPP has come up with their own mechanism on how to communicate DNS server IPv6 addresses during PDP context activation: the use of containers as part of the PCO Information Element used in L3-SM and GTP (see Section 10.5.6.3 of 3GPP TS 24.008). By the way, they also specified the same mechanism for IPv4, so there are now two competing methods of provisioning IPv4 DNS server information: IPCP and the new method.
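As a rough illustration of the container mechanism, the following sketch encodes DNS server IPv6 addresses as identifier/length/value containers. It is not the OpenGGSN code. The container identifier 0x0003 and the leading 0x80 octet are assumptions based on my reading of TS 24.008 and should be checked against Section 10.5.6.3 before being relied upon; the function names and the example address are hypothetical.

```python
import ipaddress
import struct

# Assumed values, to be verified against 3GPP TS 24.008 Section 10.5.6.3.
PCO_CONTAINER_DNS_IPV6 = 0x0003  # assumed container ID for "DNS Server IPv6 Address"
PCO_FIRST_OCTET = 0x80           # assumed: extension bit set, configuration protocol 0


def encode_container(container_id: int, payload: bytes) -> bytes:
    """Encode one container: 16-bit identifier, 8-bit length, then the contents."""
    if len(payload) > 255:
        raise ValueError("container payload too long")
    return struct.pack("!HB", container_id, len(payload)) + payload


def encode_dns6_pco(servers) -> bytes:
    """Build a PCO value carrying one container per DNS server IPv6 address."""
    body = b"".join(
        encode_container(PCO_CONTAINER_DNS_IPV6, ipaddress.IPv6Address(s).packed)
        for s in servers
    )
    return bytes([PCO_FIRST_OCTET]) + body


pco = encode_dns6_pco(["2001:db8::53"])  # hypothetical DNS server address
print(pco.hex())
```

Each address costs 19 octets (3 of header plus 16 of address), which is all it takes to avoid running DHCPv6 over the PDP context.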

In any case, after some more hacking, OpenGGSN can now also provide DNS server information to the MS/UE. And once that was implemented, I had actual live user IPv6 data over a full Osmocom cellular stack!

Summary

We now have working IPv6 User IP in OpenGGSN. Together with the rest of the Osmocom stack you can operate a private GPRS, EGPRS, UMTS or HSPA network that provides end-to-end transparent, routed IPv6 connectivity to mobile devices.

All in all, it took much longer than needed, and the following questions remain in my mind:

  • why did the IETF not specify IP6CP capabilities to configure DNS servers?
  • why the complex two-stage address configuration with PDP EUA allocation for the link-local address first and then stateless autoconfiguration?
  • why don't we simply allocate the entire prefix via the End User Address information element on the signaling plane? For sure next to the 16byte address we could have put one byte for prefix-length?
  • why do I see duplication detection flavour neighbour solicitations from Qualcomm based phones on what is a point-to-point link with exactly two devices: The UE and the GGSN?
  • why do I see link-layer source address options inside the ICMPv6 neighbor and router solicitation from mobile phones, when that option is specifically not to be used on point-to-point links?
  • why is the smallest prefix that can be allocated a /64? That's such a waste for a point-to-point link with a single device on the other end, and in times of billions of connected IoT devices it will just encourage the use of non-public IPv6 space (i.e. SNAT/MASQUERADING) while wasting large parts of the address space

Some of those choices would have made sense if one had made it fully compatible with normal IPv6 as used e.g. on Ethernet. But implementing ICMPv6 router and neighbor solicitation without getting any benefit, such as the ability to have multiple prefixes or prefixes of different lengths? I just don't understand why anyone ever thought this was a good idea.

You can find the code at http://git.osmocom.org/openggsn/log/?h=laforge/ipv6 and the related ticket at https://osmocom.org/issues/2418

by Harald Welte at August 08, 2017 10:00 PM

July 31, 2017

Open Hardware Repository

White Rabbit - 31-07-2017: WR Switch Production Test Suite published

For the past few months we have been working with INCAA Computers BV on a new WR Switch Production Test Suite. This system makes it possible to verify, during production or after delivery, that all the components of the WR Switch hardware work properly.
Please check the WRS PTS wiki page for all the sources and documentation.

by Grzegorz Daniluk (grzegorz.daniluk@cern.ch) at July 31, 2017 06:11 PM

July 27, 2017

Bunnie Studios

Name that Ware July 2017

The Ware for July 2017 is shown below.

Decided to do this one with the potting on to make it a smidgen more challenging.

by bunnie at July 27, 2017 04:17 AM

Winner, Name that Ware June 2017

The Ware for June 2017 is an ultrasonic delay line. Picked this beauty up while wandering the junk shops of Akihabara. There’s something elegant about the Old Ways that’s simply irresistible to me…back when the answer to all hard problems was not simply “transform it into the software domain and then compute the snot out of it”.

Grats to plum33 for nailing it! email me for your prize.

by bunnie at July 27, 2017 04:17 AM

July 19, 2017

Video Circuits

Photos From The Video Workshop

The video workshop Alex and I gave was one of the best I have delivered. All 15 attendees got to take home a working CHA/V module they built in the class; it's a hacked VGA signal generator that basically allows you to build a simple video synth by adding some home-brew or off-the-shelf oscillators. We had a great mix of attendees and they were all from really interesting backgrounds and super engaged. Alex as usual did a nicely paced video synthesis tutorial and I then led the theory and building part of the class. We rounded up with Alex leading a discussion around historical video synthesis work and then proceeded to enjoy the evening concerts that were also part of the fantastic Brighton modular meet. (Pics 3+9 here are from Fabrizio D'Amico who runs Video Hack Space.) Thanks to Andrew for organising the amazing meet which hosted the workshop, Matt for making our panels last minute, George for helping us out on the day and Steve from Thonk for supplying some components for the kits.










by Chris (noreply@blogger.com) at July 19, 2017 02:39 AM

July 18, 2017

Harald Welte

Virtual Um interface between OsmoBTS and OsmocomBB

During the last couple of days, I've been working on completing, cleaning up and merging a Virtual Um interface (i.e. virtual radio layer) between OsmoBTS and OsmocomBB. After I started with the implementation and left it in an early stage in January 2016, Sebastian Stumpf completed it around early 2017, now with some subsequent fixes and improvements by me. The combined result allows us to run a complete GSM network with 1-N BTSs and 1-M MSs without any actual radio hardware, which is of course excellent for all kinds of testing scenarios.

The Virtual Um layer is based on sending L2 frames (blocks) encapsulated via GSMTAP UDP multicast packets. There are two separate multicast groups, one for uplink and one for downlink. The multicast nature simulates the shared medium and enables any simulated phone to receive the signal from multiple BTSs via the downlink multicast group.
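To make the transport concrete, here is a minimal sketch of a passive listener for such a multicast group. It is not part of OsmoBTS or OsmocomBB. UDP port 4729 is the registered GSMTAP port and the 16-byte header layout follows the GSMTAP definition as I understand it, but the two group addresses below are assumptions and should be replaced by whatever your osmo-bts-virtual and virtphy configuration actually uses.

```python
import socket
import struct

GSMTAP_UDP_PORT = 4729
DOWNLINK_GROUP = "239.193.23.1"  # assumed group address, check your configuration
UPLINK_GROUP = "239.193.23.2"    # assumed group address, check your configuration

# version, hdr_len (32-bit words), type, timeslot, arfcn, signal_dbm, snr_db,
# frame_number, sub_type, antenna_nr, sub_slot, reserved
GSMTAP_HDR = struct.Struct("!BBBBHbbIBBBB")
ARFCN_MASK = 0x3FFF
ARFCN_FLAG_UPLINK = 0x4000


def parse_gsmtap(packet: bytes):
    """Split a GSMTAP datagram into a dict of header fields and the L2 payload."""
    if len(packet) < GSMTAP_HDR.size:
        raise ValueError("packet shorter than GSMTAP header")
    (version, hdr_words, msg_type, timeslot, arfcn, signal_dbm, snr_db,
     frame_number, sub_type, _antenna, sub_slot, _res) = GSMTAP_HDR.unpack_from(packet)
    hdr_len = hdr_words * 4
    if hdr_len < GSMTAP_HDR.size or hdr_len > len(packet):
        raise ValueError("invalid GSMTAP header length")
    info = {
        "version": version,
        "type": msg_type,
        "timeslot": timeslot,
        "arfcn": arfcn & ARFCN_MASK,
        "uplink": bool(arfcn & ARFCN_FLAG_UPLINK),
        "signal_dbm": signal_dbm,
        "frame_number": frame_number,
        "sub_type": sub_type,
        "sub_slot": sub_slot,
    }
    return info, packet[hdr_len:]


def open_multicast_listener(group: str, port: int = GSMTAP_UDP_PORT) -> socket.socket:
    """Join one multicast group, much as every simulated phone joins the downlink group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock
```

A monitoring script would call `open_multicast_listener(DOWNLINK_GROUP)` and pass each result of `recvfrom()` to `parse_gsmtap()`. Because every listener on the group sees every frame, this mirrors the shared-medium behaviour described above.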


In OsmoBTS, this is implemented via the new osmo-bts-virtual BTS model.

In OsmocomBB, this is realized by adding virtphy, a virtual L1 which speaks the same L1CTL protocol that is used between the real OsmocomBB Layer1 and the Layer2/3 programs such as mobile and the like.

Now many people would argue that GSM without the radio and actual handsets is no fun. I tend to agree, as I'm a hardware person at heart and I am not a big fan of simulation.

Nevertheless, this forms the basis of all kinds of possibilities for automated (regression) testing, in a way and for layers/interfaces that osmo-gsm-tester cannot cover, as it uses a black-box proprietary mobile phone (modem). It is also pretty useful if you're traveling a lot and don't want to carry around a BTS and phones all the time, or want to get some development done in airplanes or other places where operating a radio transmitter is not really a (viable) option.

If you're curious and want to give it a shot, I've put together some setup instructions at the Virtual Um page of the Osmocom Wiki.

by Harald Welte at July 18, 2017 10:00 PM

July 15, 2017

Bunnie Studios

That’s a Big Microscope…

I’ve often said that there are no secrets in hardware — you just need a bigger, better microscope.

I think I’ve found the limit to that statement. To give you an idea, here’s the “lightbulb” that powers the microscope:

It’s the size of a building, and it’s the Swiss Light Source. Actually, not all of that building is dedicated to this microscope, just one beamline of an X-ray synchrotron capable of producing photons at an energy of 6.5keV (X-rays) at a flux of close to a billion coherent photons per second — but still, it’s a big light bulb. It might be a while before you see one of these popping up in a hacker’s garage…err, hangar…somewhere.

The result? One can image, in 3-D and “non-destructively” (e.g., without having to delayer or etch away dielectrics), chips down to a resolution of 14.6nm.

That’s a pretty neat trick if you’re trying to reverse engineer modern silicon.

You can read the full article at Nature (“High Resolution non-destructive three-dimensional imaging of integrated circuits” by Mirko Holler et al). I’m a paying subscriber to Nature so I’m supposed to have access to the article, but at the moment, their paywall is throwing a null pointer exception. Once the paywall is fixed you can buy a copy of the article to read, but in the meantime, SciHub seems more reliable.

You get what you pay for, right?

by bunnie at July 15, 2017 01:55 PM

July 11, 2017

Elphel

Current video stream latency and a way to reduce it

Fig.1 Live stream latency testing

Recently we had an inquiry whether our cameras are capable of streaming low latency video. The short answer is yes, the camera’s average output latency for 1080p at 30 fps is ~16 ms. It is possible to reduce it to almost 0.5 ms with a few changes to the driver.

However, the total latency of the system, from capturing to displaying, includes delays caused by the network, PC, software and display.

In the results of the experiment (similar to this one) these delays contribute the most (around 40-50 ms) to the stream latency – at least, for the given equipment.


 

Goal

Measure the total latency of a live stream over the network from a 10393 camera.
 

Setup

  • Camera: NC393-F-CS
    • Resolution@fps: 1080p@30fps,  720p@60fps
    • Compression quality: 90%
    • Exposure time: 1.7 ms
    • Stream formats: mjpeg, rtsp
    • Sensor: MT9P001, 5MPx, 1/2.5″
    • Lens: Computar f=5mm, f/1.4, 1/2″
  • PC: Shuttle box, i7, 16GB RAM, GeForce GTX 560 Ti
  • Display: ASUS VS24A, 60Hz (=16.7ms), 5ms gtg
  • OS: Kubuntu 16.04
  • Network connection: 1Gbps, direct camera-PC via cable
  • Applications:
    • gstreamer
    • chrome, firefox
    • mplayer
    • vlc
  • Stopwatch: basic javascript

 

Notes



Table 1: Transfer times and data rate

Resolution/fps | Image size [1], KB | Transfer time [2], ms | Data rate [3], Mbps
720p/60        | 250                | 2                     | 120
1080p/30       | 500                | 4                     | 120

[1] average compressed (90%) image size
[2] time it takes to transfer a single image over the network; jitter is unknown. t = Image_size / 1 Gbps
[3] required bandwidth: rate = fps * Image_size
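The figures in Table 1 can be reproduced with a few lines of arithmetic. This is only a sketch: the function names are made up, 1 KB is taken as 1000 bytes, and protocol overhead on the wire is ignored.

```python
LINK_BPS = 1e9  # 1 Gbps direct camera-PC link from the setup above


def transfer_time_ms(image_kb: float, link_bps: float = LINK_BPS) -> float:
    """Time to push one compressed image over the link: size divided by link speed."""
    bits = image_kb * 1000 * 8
    return bits / link_bps * 1000


def data_rate_mbps(image_kb: float, fps: float) -> float:
    """Sustained bandwidth: frames per second times image size."""
    return image_kb * 1000 * 8 * fps / 1e6


for name, size_kb, fps in (("720p/60", 250, 60), ("1080p/30", 500, 30)):
    print(name, round(transfer_time_ms(size_kb), 1), "ms,",
          round(data_rate_mbps(size_kb, fps)), "Mbps")
```

Both modes end up at the same data rate because halving the frame rate is offset by doubling the average image size.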

Camera output latency calculation

Briefly: all numbers are for the given lens, sensor, camera setup and parameters.

Sensor
Because of ERS (electronic rolling shutter), each row's latency is different. See Tables 2 and 3.
 
Table 2: tROW and tTR

Resolution           | tROW [1], us | tTR [2], us
720p                 | 22.75        | 13.33
1080p                | 29.42        | 20
full res (2592×1936) | 36.38        | 27

[1] row time, see datasheet. tROW = f(Width)
[2] time it takes to transfer a row over the sensor cable, clock = 96MHz. tTR = Width/96MHz
 
Table 3: Average latency and the whole range.

Resolution | tERS avg [1], ms | tERS whole range [2], ms
720p       | 8                | 0.01-16
1080p      | 16               | 0.02-32

[1] average latency
[2] min: last row latency, max: 1st row latency
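One way to arrive at numbers close to Table 3 is sketched below. The model is an assumption on my part rather than something stated in the text: a row has to wait for all later rows to be read out (one tROW each) plus its own transfer time tTR, so the last row waits only tTR and the first row waits almost a full frame readout.

```python
def ers_row_latency_ms(row: int, rows: int, t_row_us: float, t_tr_us: float) -> float:
    """Assumed model: remaining rows times tROW, plus one row transfer time tTR."""
    rows_after = rows - 1 - row
    return (rows_after * t_row_us + t_tr_us) / 1000.0


def ers_summary(rows: int, t_row_us: float, t_tr_us: float) -> dict:
    """Latency of the last row (min), the first row (max) and their mean."""
    first = ers_row_latency_ms(0, rows, t_row_us, t_tr_us)
    last = ers_row_latency_ms(rows - 1, rows, t_row_us, t_tr_us)
    return {"min_ms": last, "max_ms": first, "avg_ms": (first + last) / 2}


# tROW and tTR values taken from Table 2.
summary_720p = ers_summary(720, 22.75, 13.33)
summary_1080p = ers_summary(1080, 29.42, 20.0)
print(summary_720p, summary_1080p)
```

Under this assumption the results round to the table's 8 ms and 16 ms averages and to the 0.01-16 ms and 0.02-32 ms ranges.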

Exposure

tEXP < 1 ms – typical exposure time for outdoors. A display is bright enough to set 1.7 ms with the gains maxed.

Compressor

The compressor is implemented in the FPGA and works 3 times faster than the sensor readout, but needs a stripe of 20 rows in memory. Thus, the compressor will finish ~20/3*tROW after the whole image is read out.

tCMP = 20/3*tROW

Summary

tCAM = tERS + tEXP + tCMP

Since the image is read and compressed by the FPGA logic of the Zynq, and this pipeline has been simulated, we can be confident in these numbers.
 
Table 4: Average output latency + exposure

Resolution | tCAM, ms
720p       | 9.9
1080p      | 17.9
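As a cross-check, the Table 4 values follow from tCAM = tERS + tEXP + tCMP with the numbers given earlier. The sketch below uses the rounded tERS averages from Table 3 and the 1.7 ms exposure from the setup; the constant and function names are made up for illustration.

```python
T_ROW_US = {"720p": 22.75, "1080p": 29.42}    # Table 2
T_ERS_AVG_MS = {"720p": 8.0, "1080p": 16.0}   # Table 3 (rounded averages)
T_EXP_MS = 1.7                                # exposure time from the setup
STRIPE_ROWS = 20                              # rows the compressor keeps in memory
COMPRESSOR_SPEEDUP = 3                        # compressor runs 3 times faster


def compressor_latency_ms(t_row_us: float) -> float:
    """tCMP = 20/3 * tROW, converted from microseconds to milliseconds."""
    return STRIPE_ROWS / COMPRESSOR_SPEEDUP * t_row_us / 1000.0


def camera_latency_ms(resolution: str) -> float:
    """tCAM = tERS + tEXP + tCMP for one of the tabulated resolutions."""
    return (T_ERS_AVG_MS[resolution] + T_EXP_MS
            + compressor_latency_ms(T_ROW_US[resolution]))


for res in ("720p", "1080p"):
    print(res, round(camera_latency_ms(res), 1), "ms")
```

The compressor term contributes only about 0.15 to 0.2 ms, so the rolling-shutter readout dominates the camera-side latency.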

Stopwatch accuracy

Not accurate. For simplicity, we will rely on the camera’s internal clock that time stamps every image, and take the javascript timer readings as unique labels, thus not caring what time they are showing.
 

Results

Fig.2 1080p 30fps

Fig.3 720p 60fps

 
GStreamer has shown the best results among the tested programs.
Since the camera fps is discrete the result is a multiple of 1/fps (see this article):

  • 30 fps => 33.3 ms
  • 60 fps => 16.7 ms

 

Resolution/fps | Total latency, ms | Network+PC+SW latency, ms
720p@60fps     | 33.3-50           | 23.4-40.1
1080p@30fps    | 33.3-66.7         | 15.4-48.8
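The last column appears to be the measured total minus the camera output latency tCAM from Table 4. A small sketch of that subtraction follows; the frame counts (2 to 3 frames at 60 fps, 1 to 2 frames at 30 fps) are inferred from the totals above, and the function names are hypothetical.

```python
def frame_period_ms(fps: float) -> float:
    """The measured total is quantised to multiples of this period."""
    return 1000.0 / fps


def residual_latency_ms(frames: int, fps: float, t_cam_ms: float) -> float:
    """Network + PC + software share: total latency minus camera output latency."""
    return frames * frame_period_ms(fps) - t_cam_ms


# (fps, tCAM in ms, observed range of the total latency in frames)
cases = {"720p@60fps": (60, 9.9, (2, 3)), "1080p@30fps": (30, 17.9, (1, 2))}
for name, (fps, t_cam, (lo, hi)) in cases.items():
    print(name,
          round(residual_latency_ms(lo, fps, t_cam), 1), "-",
          round(residual_latency_ms(hi, fps, t_cam), 1), "ms")
```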

 

Possible improvements

Camera

Currently, the driver waits for the interrupt from the compressor that indicates the image is fully compressed and ready for transfer. However, one does not have to wait for the whole image: the transfer can start as soon as the minimum amount of compressed data is ready.

There are 3 more interrupts related to the image pipeline events. One of them is “compression started” – switching to it can reduce the output latency to (10+20/3)*tROW or 0.4 ms for 720p and 0.5 ms for 1080p.
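The quoted 0.4 ms and 0.5 ms follow directly from (10+20/3)*tROW with the row times from Table 2. A quick check, with illustrative parameter names:

```python
def early_start_latency_ms(t_row_us: float, wait_rows: int = 10,
                           stripe_rows: int = 20, speedup: int = 3) -> float:
    """Output latency (10 + 20/3) * tROW when starting on 'compression started'."""
    return (wait_rows + stripe_rows / speedup) * t_row_us / 1000.0


print(round(early_start_latency_ms(22.75), 1))  # 720p
print(round(early_start_latency_ms(29.42), 1))  # 1080p
```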

Other hardware and software

In addition to the most obvious improvements:

  • For wifi: use 5GHz over 2.4GHz – smaller jitter, non-overlapping channels
  • Lower latency software: for mjpeg, use gstreamer or vlc (which takes extra effort to set up) rather than chrome or firefox, because the browsers do extra buffering

 

Updates

 
Table 6: Camera ports

Sensor port | mjpeg | rtsp
port 0      | 2323  | 554
port 1      | 2324  | 556
port 2      | 2325  | 558
port 3      | 2326  | 560
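The port numbering in Table 6 is regular enough to generate. The helper below is a hypothetical convenience, not part of the camera software; it assumes the /mimg path shown for port 0 in the examples is the same on the other sensor ports.

```python
def stream_urls(host: str, sensor_port: int) -> dict:
    """Return the mjpeg and rtsp URLs for sensor port 0..3 following Table 6."""
    if sensor_port not in range(4):
        raise ValueError("sensor port must be 0, 1, 2 or 3")
    return {
        "mjpeg": f"http://{host}:{2323 + sensor_port}/mimg",  # /mimg path assumed
        "rtsp": f"rtsp://{host}:{554 + 2 * sensor_port}",
    }


print(stream_urls("192.168.0.9", 0))
```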

GStreamer pipelines

  • For mjpeg:

~$ gst-launch-1.0 souphttpsrc is-live=true location=http://192.168.0.9:2323/mimg ! jpegdec ! xvimagesink

  • For rtsp:

~$ gst-launch-1.0 rtspsrc is-live=true location=rtsp://192.168.0.9:554 ! rtpjpegdepay ! jpegdec ! xvimagesink

VLC

~$ vlc rtsp://192.168.0.9:554

Chrome/Firefox

Open http://192.168.0.9:2323/mimg

by Oleg Dzhimiev at July 11, 2017 05:33 PM

July 10, 2017

Free Electrons

Linux 4.12, Free Electrons contributions

Linus Torvalds released the 4.12 Linux kernel a week ago, in what is the second biggest kernel release ever by number of commits. As usual, LWN had very nice coverage of the major new features and improvements: first part, second part and third part.

LWN has also published statistics about the Linux 4.12 development cycle, showing:

  • Free Electrons as the #14 contributing company by number of commits, with 221 commits, between Broadcom (230 commits) and NXP (212 commits)
  • Free Electrons as the #14 contributing company by number of changed lines, with 16636 lines changed, just two lines fewer than Mellanox
  • Free Electrons engineer and MTD NAND maintainer Boris Brezillon as the #17 most active contributor by number of lines changed.

Our most important contributions to this kernel release have been:

  • On Atmel AT91 and SAMA5 platforms:
    • Alexandre Belloni has continued to upstream the support for the SAMA5D2 backup mode, which is a very deep suspend to RAM state, offering very nice power savings. Alexandre touched the core code in arch/arm/mach-at91 as well as pinctrl and irqchip drivers
    • Boris Brezillon has converted the Atmel PWM driver to the atomic API of the PWM subsystem, implemented suspend/resume and did a number of fixes in the Atmel display controller driver, and also removed the no longer used AT91 Parallel ATA driver.
    • Quentin Schulz improved the suspend/resume hooks in the atmel-spi driver to support the SAMA5D2 backup mode.
  • On Allwinner platforms:
    • Mylène Josserand has made a number of improvements to the sun8i-codec audio driver that she contributed a few releases ago.
    • Maxime Ripard added devfreq support to dynamically change the frequency of the GPU on the Allwinner A33 SoC.
    • Quentin Schulz added battery charging and ADC support to the X-Powers AXP20x and AXP22x PMICs, found on Allwinner platforms.
    • Quentin Schulz added a new IIO driver to support the ADCs found on numerous Allwinner SoCs.
    • Quentin Schulz added support for the Allwinner A33 built-in thermal sensor, and used it to implement thermal throttling on this platform.
  • On Marvell platforms:
    • Antoine Ténart contributed Device Tree changes to describe the cryptographic engines found in the Marvell Armada 7K and 8K SoCs. For now only the Device Tree description has been merged, the driver itself will arrive in Linux 4.13.
    • Grégory Clement has contributed a pinctrl and GPIO driver for the Marvell Armada 3720 SoC (Cortex-A53 based)
    • Grégory Clement has improved the Device Tree description of the Marvell Armada 3720 and Marvell Armada 7K/8K SoCs and corresponding evaluation boards: SDHCI and RTC are now enabled on Armada 7K/8K, USB2, USB3 and RTC are now enabled on Armada 3720.
    • Thomas Petazzoni made a significant number of changes to the mvpp2 network driver, finally adding support for the PPv2.2 version of this Ethernet controller. This made it possible to enable network support on the Marvell Armada 7K/8K SoCs.
    • Thomas Petazzoni contributed a number of fixes to the mv_xor_v2 dmaengine driver, used for the XOR engines on the Marvell Armada 7K/8K SoCs.
    • Thomas Petazzoni cleaned up the MSI support in the Marvell pci-mvebu and pcie-aardvark PCI host controller drivers, which made it possible to remove a no-longer-used MSI kernel API.
  • On the ST SPEAr600 platform:
    • Thomas Petazzoni added support for the ADC available on this platform, by adding its Device Tree description and fixing a clock driver bug
    • Thomas did a number of small improvements to the Device Tree description of the SoC and its evaluation board
    • Thomas cleaned up the fsmc_nand driver, which is used for the NAND controller driver on this platform, removing lots of unused code
  • In the MTD NAND subsystem:
    • Boris Brezillon implemented a mechanism to allow vendor-specific initialization and detection steps to be added, on a per-NAND chip basis. As part of this effort, he has split the vendor-specific initialization sequences for Macronix, AMD/Spansion, Micron, Toshiba, Hynix and Samsung NANDs into multiple files. This work will make it easier in the future to exploit the vendor-specific features of different NAND chips.
  • Other contributions:
    • Maxime Ripard added a display panel driver for the ST7789V LCD controller

In addition, several Free Electrons engineers are also maintainers of various kernel subsystems. During this release cycle, they reviewed and merged a number of patches from kernel contributors:

  • Maxime Ripard, as the Allwinner co-maintainer, merged 94 patches
  • Boris Brezillon, as the NAND maintainer and MTD co-maintainer, merged 64 patches
  • Alexandre Belloni, as the RTC maintainer and Atmel co-maintainer, merged 38 patches
  • Grégory Clement, as the Marvell EBU co-maintainer, merged 32 patches


by Thomas Petazzoni at July 10, 2017 10:13 AM

July 09, 2017

Harald Welte

Ten years after first shipping Openmoko Neo1973

Exactly 10 years ago, on July 9th, 2007, we started to sell and ship the first Openmoko Neo1973. To be more precise, the webshop actually opened a few hours early, depending on your time zone. Sean announced the availability in this mailing list post.

I don't really have much to add to my ten years [of starting to work on] Openmoko anniversary blog post a year ago, but still thought it worthwhile to point out the tenth anniversary.

These were exciting times, and there was a lot of pioneering spirit: building a Linux-based smartphone with a 100% FOSS software stack on the application processor, including all drivers, userland and applications, at a time before Android was known or announced. As history shows, we'd been working in parallel with Apple on the iPhone, and Google on Android. Of course there's little chance that a small Taiwanese company can compete with the endless resources of the big industry giants, and the many Neo1973 delays meant we had missed the window of opportunity to be the first on the market.

It's sad that Openmoko (or similar projects) have not survived even as a special-interest project for FOSS enthusiasts. Today, virtually all options of smartphones are encumbered with way more proprietary blobs than we could ever imagine back then.

In any case, the tenth anniversary of trying to change the amount of Free Software in the smartphone world is worth some celebration. I'm reaching out to old friends and colleagues, and I guess we'll have somewhat of a celebration party both in Germany and in Taiwan (where I'll be for my holidays from mid-September to mid-October).

by Harald Welte at July 09, 2017 02:00 PM

July 07, 2017

Open Hardware Repository

White Rabbit core collection - White Rabbit PTP Core v4.1 released

We have just released v4.1 of the WR PTP Core. You can find all the links to download the reference design binaries and documentation on our release wiki page.

This release contains mainly fixes to the previous v4.0 stable release:
  • fixed PCIe reset for standalone operation
  • fixes to p2p mode in PPSi
  • fixed Rx termination scheme for Spartan6 PHY which made WRPC unable to work with some SFPs
  • fixes and updates to HDL board and platform wrappers
and also some new features like:
  • new Wishbone registers bank available to read WRPC diagnostics from user application
  • new wrpc-diags host tool to read diagnostics over PCIe or VME
  • built-in default init script that loads SFP calibration parameters and configures WRPC in Slave mode
  • new document WRPC Failures and Diagnostics

Thank you for all the bug reports and contributions. As always, we encourage you to try this fresh release on your boards.

Greg Daniluk for the WR PTP Core team

by Grzegorz Daniluk (grzegorz.daniluk@cern.ch) at July 07, 2017 02:21 PM

July 03, 2017

Open Hardware Repository

OHR Meta Project - 29-06-2017: Open Doors for Universal Embedded Design

The article Open Doors for Universal Embedded Design in Embedded Systems Engineering, written by Caroline Hayes, Senior Editor, reads:

Charged with finding cost-effective integration for multicore platforms, the European Union’s (EU) Artemis EMC2 project finished at the end of May this year. A further initiative with CERN could mean the spirit of co-operation and the principles of open hardware herald an era of innovation.

and

This collaboration is a new initiative. The PC/104 Consortium will provide design-in examples of new and mature boards, with a reference design, for others to use and create something new. Although the Sundance board is the only [PC/104] product on the CERN Open Hardware Repository, there will be more news in the summer, promises Christensen. “My goal is to get five designs within the first year,” he says, and he is actively working to promote to PC/104 Consortium members that there is a place where they can download—and upload—reference designs which are PC/104-compatible.

Read the full article.

by Erik van der Bij (Erik.van.der.Bij@cern.ch) at July 03, 2017 02:14 PM

June 28, 2017

Video Circuits

Video Circuits Workshop 01/07/17

Alex and I are running another workshop, this time at the Brighton Modular Meet. Book here:
https://www.attenboroughcentre.com/events/902/video-synthesis-workshop/
We will also be running a video synthesis room all Sunday.
http://brightonmodularmeet.co.uk/brighton-modular-meet---about.html

Come hang out and enjoy the rest of the meet.

Some pics of the panels we had made by Matt for Jona's CH/AV project (which we will be making on the day). There is also a shot of some audio oscillators driving the CH/AV to a VGA monitor.







by Chris (noreply@blogger.com) at June 28, 2017 08:58 AM

June 26, 2017

Bunnie Studios

Name that Ware June 2017

The Ware for June 2017 is shown below.

If nobody can guess this one from just the pointy end of the stick, I’ll post a photo with more context…

by bunnie at June 26, 2017 08:06 PM

Winner, Name that Ware May 2017

The Ware for May 2017 is the “Lorentz and Hertz” carriage board from an HP Officejet Pro 8500. Congrats to MegabytePhreak for nailing both the make and model of the printer it came from! email me for your prize.

I found the name of the board to be endearing.

by bunnie at June 26, 2017 08:06 PM

June 19, 2017

Free Electrons

Free and ready-to-use cross-compilation toolchains

For all embedded Linux developers, cross-compilation toolchains are part of the basic tool set, as they make it possible to build code for a specific CPU architecture and debug it. Until a few years ago, CodeSourcery was providing a lot of high quality pre-compiled toolchains for a wide range of architectures, but has progressively stopped doing so. Linaro provides some freely available toolchains, but only targeting ARM and AArch64. kernel.org has a set of pre-built toolchains for a wider range of architectures, but they are bare metal toolchains (which cannot build Linux userspace programs) and are updated infrequently.

To fill in this gap, Free Electrons is happy to announce its new service to the embedded Linux community: toolchains.free-electrons.com.

Free Electrons toolchains

This web site provides a large number of cross-compilation toolchains, available for a wide range of architectures, in multiple variants. The toolchains are based on the classical combination of gcc, binutils and gdb, plus a C library. We currently provide a total of 138 toolchains, covering many combinations of:

  • Architectures: AArch64 (little and big endian), ARC, ARM (little and big endian, ARMv5, ARMv6, ARMv7), Blackfin, m68k (Coldfire and 68k), Microblaze (little and big endian), MIPS32 and MIPS64 (little and big endian, with various instruction set variants), NIOS2, OpenRISC, PowerPC and PowerPC64, SuperH, Sparc and Sparc64, x86 and x86-64, Xtensa
  • C libraries: GNU C library, uClibc-ng and musl
  • Versions: for each combination, we provide a stable version which uses slightly older but more proven versions of gcc, binutils and gdb, and we provide a bleeding edge version with the latest version of gcc, binutils and gdb.

After being generated, most of the toolchains are tested by building a Linux kernel and a Linux userspace, and booting it under Qemu, which verifies that the toolchain is minimally working. We plan on adding more tests to validate the toolchains, and welcome your feedback on this topic. Of course, not all toolchains are tested this way, because some CPU architectures are not emulated by Qemu.

The toolchains are built with Buildroot, but can be used for any purpose: build a Linux kernel or bootloader, as a pre-built toolchain for your favorite embedded Linux build system, etc. The toolchains are available in tarballs, together with licensing information and instructions on how to rebuild the toolchain if needed.

We are very much interested in your feedback about those toolchains, so do not hesitate to report bugs or make suggestions in our issue tracker!

This work was done as part of the internship of Florent Jacquet at Free Electrons.

by Thomas Petazzoni at June 19, 2017 07:52 AM

June 15, 2017

Harald Welte

How the Osmocom GSM stack is funded

As the topic has been raised on twitter, I thought I might share a bit of insight into the funding of the Osmocom Cellular Infrastructure Projects.

Keep in mind: Osmocom is a much larger umbrella project, and beyond the network-side cellular stack it is home to many different community-based projects around open source mobile communications. All of those started more or less as just-for-fun projects, nothing serious, just a hobby [1].

The projects implementing the network-side protocol stacks and network elements of GSM/GPRS/EGPRS/UMTS cellular networks are somewhat the exception to that, as they have to some extent become professionalized. We call those projects collectively the Cellular Infrastructure projects inside Osmocom. This post is about that part of Osmocom only.

History

From late 2008 through 2009, people like Holger and I were working on bs11-abis and later OpenBSC only in our spare time. The name Osmocom didn't even exist back then. There was a strong technical community with contributions from Sylvain Munaut, Andreas Eversberg, Daniel Willmann, Jan Luebbe and a few others. None of this would have been possible if it wasn't for all the help we got from Dieter Spaar with the BS-11 [2]. We all had our day jobs in other places, and OpenBSC work was really just a hobby. People were working on it because it was where no FOSS hacker had gone before. It was cool. It was a big and pleasant challenge to enter the closed telecom space as pure autodidacts.

Holger and I were doing freelance contract development work on Open Source projects for many years before. I was mostly doing Linux related contracting, while Holger has been active in all kinds of areas throughout the FOSS software stack.

In 2010, Holger and I saw some first interest from companies in OpenBSC, including Netzing AG and On-Waves ehf. So we were able to spend at least some of our paid time on OpenBSC/Osmocom related contract work, and were thus able to do less other work. We also continued to spend tons of spare time on bringing Osmocom forward. Also, the amount of contract work we did was only a fraction of the many more hours of spare time.

In 2011, Holger and I decided to start the company sysmocom in order to generate more funding for the Osmocom GSM projects by means of financing software development by product sales. So rather than doing freelance work for companies who bought their BTS hardware from other places (and spent huge amounts of cash on that), we decided that we wanted to be a full solution supplier, who can offer a complete product based on all hardware and software required to run small GSM networks.

The only problem is: We still needed an actual BTS for that. Through some reverse engineering of existing products we figured out who one of the ODM suppliers for the hardware + PHY layer was, and decided to develop the OsmoBTS software to do so. We inherited some of the early code from work done by Andreas Eversberg on the jolly/bts branch of OsmocomBB (thanks), but much was missing at the time.

What followed was Holger and me working several years for free [3], without any salary, in order to complete the OsmoBTS software, build an embedded Linux distribution around it based on OE/poky, write documentation, etc. and complete the first sysmocom product: the sysmoBTS 1002.

We did that not because we want to get rich, or because we want to run a business. We did it simply because we saw an opportunity to generate funding for the Osmocom projects and make them more sustainable and successful. And because we believe there is a big, gaping, huge vacuum in terms of absence of FOSS in the cellular telecom sphere.

Funding by means of sysmocom product sales

Once we started to sell the sysmoBTS products, we were able to fund Osmocom related development from the profits made on hardware / full-system product sales. Every single unit sold made a big contribution towards funding both the maintenance as well as the ongoing development on new features.

This source of funding continues to be an important factor today.

Funding by means of R&D contracts

The probably best and most welcome method of funding Osmocom related work is by means of R&D projects in which a customer funds our work to extend the Osmocom GSM stack in one particular area where he has a particular need that the existing code cannot fulfill yet.

This kind of project is the ideal match, as it shows where the true strength of FOSS is: Each of those customers did not have to fund the development of a GSM stack from scratch. Rather, they only had to fund those bits that were missing for their particular application.

Our reference for this is and has been On-Waves, who have been funding development of their required features (and bug fixing etc.) since 2010.

We've of course had many other projects from a variety of customers over the years. Last, but not least, we had a customer who willingly co-funded (together with funds from NLnet foundation and lots of unpaid effort by sysmocom) the 3G/3.5G support in the Osmocom stack.

The problem here is:

  • we have not been able to secure anywhere near as many of those R&D projects within the cellular industry as we would need, despite believing we have a very good foundation upon which we can build. I've been writing many exciting technical project proposals
  • you almost exclusively get funding only for new features. But it's very hard to get funding for the core maintenance work. The bug-fixing, code review, code refactoring, testing, etc.

So as a result, the profit margin you have on selling R&D projects is basically used to (do a bad job of) fund those bits and pieces that nobody wants to pay for.

Funding by means of customer support

There is a way to generate funding for development by providing support services. We've had some success with this, but primarily alongside the actual hardware/system sales - not so much in terms of pure software-only support.

Also, providing support services from an R&D company means:

  • either you distract your developers by handling support inquiries. This means they will have less time to work on actual code, and likely get sidetracked by too many issues that make it hard to focus
  • or you have to hire separate support staff. This of course means that the size of the support business has to be sufficiently large to not only cover the costs of hiring + training support staff, but also still generate funding for the actual software R&D.

We briefly tried the second option, but have fallen back to the first for now. There's simply not sufficient user/admin type support business to justify dedicated staff for that.

Funding by means of cross-subsidizing from other business areas

sysmocom also started to do some non-Osmocom projects in order to generate revenue that we can feed again into Osmocom projects. I'm not at liberty to discuss them in detail, but basically we've been doing pretty much anything from

  • custom embedded Linux board designs
  • M2M devices with GSM modems
  • consulting gigs
  • public tendered research projects

Profits from all those areas went again into Osmocom development.

Last, but not least, we also operate the sysmocom webshop. The profit we make on those products also is again immediately re-invested into Osmocom development.

Funding by grants

We've had some success in securing funding from NLnet Foundation for specific features. While this is useful, the size of their project grants of up to EUR 30k is not a good fit for the scale of the tasks we have at hand inside Osmocom. You may think that's a considerable amount of money? Well, that translates to 2-3 man-months of work at a bare cost-covering rate. At a team size of 6 developers, you would theoretically have churned through that in two weeks. Also, their focus is (understandably) on Internet and IT security, and not so much cellular communications.
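The arithmetic behind that estimate can be checked in a few lines. This is only a back-of-the-envelope sketch: the monthly rate is an assumed figure, picked so that EUR 30k lands inside the quoted 2-3 man-month range, and is not an actual sysmocom number.

```python
# Back-of-the-envelope check of the grant figures quoted above.
GRANT_EUR = 30_000
ASSUMED_RATE_EUR_PER_MAN_MONTH = 12_000  # assumption, not a sysmocom figure
TEAM_SIZE = 6
WEEKS_PER_MONTH = 52 / 12


def man_months(grant_eur, rate_eur_per_man_month):
    """How many months of one person's work the grant pays for."""
    return grant_eur / rate_eur_per_man_month


def calendar_weeks(months_of_work, team_size):
    """How long the whole team takes to use up that much work."""
    return months_of_work / team_size * WEEKS_PER_MONTH


effort = man_months(GRANT_EUR, ASSUMED_RATE_EUR_PER_MAN_MONTH)
weeks = calendar_weeks(effort, TEAM_SIZE)
print(f"{effort:.1f} man-months, about {weeks:.1f} weeks for {TEAM_SIZE} developers")
```

With these assumptions the grant covers 2.5 man-months, which a team of six uses up in just under two weeks.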

There are of course other options for grants, such as government research grants and the like. However, they require long-term planning, they require you to match (i.e. pay yourself) a significant portion, and basically mandate that you hire one extra person for doing all the required paperwork and reporting. So all in all, not a particularly attractive option for a very small company consisting of die hard engineers.

Funding by more BTS ports

At sysmocom, we've been doing some ports of the OsmoBTS + OsmoPCU software to other hardware, and supporting those other BTS vendors with porting, R&D and support services.

If sysmocom was a classic BTS vendor, we would not help our "competition". However, we are not. sysmocom exists to help Osmocom, and we strongly believe in open systems and architectures, without a single point of failure, a single supplier for any component or any type of vendor lock-in.

So we happily help third parties to get Osmocom running on their hardware, either with a proprietary PHY or with OsmoTRX.

However, we expect that those BTS vendors also understand their responsibility to share the development and maintenance effort of the stack. Preferably by dedicating some of their own staff to work in the Osmocom community. Alternatively, sysmocom can perform that work as paid service. But that's a double-edged sword: We don't want to be a single point of failure.

Osmocom funding outside of sysmocom

Osmocom is of course more than sysmocom. This holds even for the cellular infrastructure projects inside Osmocom: they are true, community-based, open, collaborative development projects. Anyone can contribute.

Over the years, there have been code contributions by e.g. Fairwaves. They, too, build GSM base station hardware and use that as a means to not only recover the R&D on the hardware, but also to contribute to Osmocom. At some point a few years ago, there was a lot of work from them in the area of OsmoTRX, OsmoBTS and OsmoPCU. Unfortunately, in more recent years, they have not been able to keep up the level of contributions.

There are other companies engaged in activities with and around Osmocom. There's Rhizomatica, an NGO helping indigenous communities to run their own cellular networks. They have been funding some of our efforts, but being an NGO helping rural regions in developing countries, they of course also don't have deep pockets. Ideally, we'd want to be the ones contributing to them, not the other way around.

State of funding

We've been making some progress in securing funding from players we cannot name [4] during recent years. We're also making occasional progress in convincing BTS suppliers to chip in their share. Unfortunately there are more who don't live up to their responsibility than those who do. I might start calling them out by name one day. The wider community and the public actually deserve to know who plays by FOSS rules and who doesn't. That's not shaming, it's just stating bare facts.

Which brings us to:

  • sysmocom is in an office that's actually too small for the team, equipment and stock. But we certainly cannot afford more space.
  • we cannot pay our employees what they could earn working at similar positions in other companies. So working at sysmocom requires dedication to the cause :)
  • Holger and I have invested way more time than we have ever paid ourselves for, even more so considering the opportunity cost of what we would have earned if we'd continued our freelance Open Source hacker path
  • we're [just barely] managing to pay for 6 developers dedicated to Osmocom development on our payroll based on the various funding sources indicated above

Nevertheless, I doubt that any team this small has ever implemented an end-to-end GSM/GPRS/EGPRS network from RAN to Core with a comparable feature set. My deepest respects to everyone involved. The big task now is to make it sustainable.

Summary

So as you can see, there's quite a bit of funding around. However, it always falls short of what's needed to implement all parts properly, and is not even quite sufficient to keep maintaining the status quo in a proper and tested way. That can often be frustrating (mostly to us but sometimes also to users who run into regressions and other bugs). There's so much more potential. So many things we have wanted to add or clean up for a long time, but too few people interested in joining in, helping out - financially or by writing code.

One thing that is often a challenge when dealing with traditional customers: We are not developing a product and then selling a ready-made product. In fact, in FOSS this would be more or less suicidal: We'd have to invest man-years upfront, but then once it is finished, everyone can use it without having to partake in that investment.

So instead, the FOSS model requires the customers/users to chip in early during the R&D phase, in order to then subsequently harvest the fruits of that.

I think the lack of a FOSS mindset across the cellular / telecom industry is the biggest constraining factor here. I saw the same thing some 15-20 years ago in the Linux world. Trust me, it takes a lot of dedication to the cause to endure this lack of comprehension so many years later.

[1]just like Linux has started out.
[2]while you will not find a lot of commits from Dieter in the code, he has been playing a key role in doing a lot of prototyping, reverse engineering and debugging!
[3]sysmocom is 100% privately held by Holger and me, we intentionally have no external investors and are proud never to have had to take a bank loan. So all we could invest was our own money and, most of all, time.
[4]contrary to the FOSS world, a lot of aspects are confidential in business, and we're not at liberty to disclose the identities of all our customers

by Harald Welte at June 15, 2017 10:00 PM

FOSS misconceptions, still in 2017

The lack of basic FOSS understanding in Telecom

Given that the Free and Open Source movement has been around at least since the 1980s, it puzzles me that people still seem to have such fundamental misconceptions about it.

Something that really triggered me was an article at LightReading [1] which quotes Ulf Ewaldsson, a leading Ericsson executive, with

"I have yet to understand why we would open source something we think is really good software"

This completely misses the point. FOSS is not about making a charity donation of a finished product to the planet.

FOSS is about sharing the development costs among multiple players, and avoiding that everyone has to reinvent the wheel. Macro-economically, it is complete and utter nonsense that each 3GPP specification gets implemented two dozen times, by at least a dozen different entities. As a result, products are way more expensive than needed.

If large Telco players (whether operators or equipment manufacturers) were to collaboratively develop code just as much as they collaboratively develop the protocol specifications, there would be no need for replicating all of this work.

As a result, everyone could produce cellular network elements at reduced cost, sharing the R&D expenses, and competing in key areas, such as who can come up with the most energy-efficient implementation, or can produce the most reliable hardware, the best receiver sensitivity, the best and most fair scheduling implementation, or whatever else. But some 80% of the code could probably be shared, as e.g. encoding and decoding messages according to a given publicly released 3GPP specification document is not where those equipment suppliers actually compete.

So my dear cellular operator executives: Next time you're cursing about the prohibitively expensive pricing that your equipment suppliers quote you: You only have to pay that much because everyone is reimplementing the wheel over and over again.

Equally, my dear cellular infrastructure suppliers: You are all dying one by one, as it's hard to develop everything from scratch. Over the years, many of you have died. One wonders whether we might still have more players left if some of you had started to cooperate in developing FOSS at least in those areas where you're not competing. You could replicate what Linux is doing in the operating system market. There's no need for a phalanx of different proprietary flavors of Unix-like OSs. It's way too expensive, and it's not an area in which most companies need to or want to compete anyway.

Management Summary

You don't first develop an entire product until it is finished and then release it as open source. This makes little economic sense in a lot of cases, as you've already invested into developing 100% of it. Instead, you actually develop a new product collaboratively as FOSS in order to not have to invest 100% but maybe only 30% or even less. You get a multiple of your R&D investment back, because you're not only getting your own code, but all the other code that other community members implemented. You of course also get other benefits, such as peer review of the code, more ideas (not all bright people work inside one given company), etc.

[1]that article is actually a heavily opinionated post by somebody who appears to have been pushing his own anti-FOSS agenda for some time. The author is misinformed about the fact that the TIP has always included projects under both FRAND and FOSS terms. As a TIP member I can attest to that fact. I'm only referencing it here for the purpose of that Ericsson quote.

by Harald Welte at June 15, 2017 10:00 PM

June 13, 2017

Free Electrons

Elixir Cross Referencer: new way to browse kernel sources

Today, we are pleased to announce the initial release of the Elixir Cross-Referencer, or just “Elixir”, for short.

What is Elixir?

Since 2006, we have provided a Linux source code cross-referencing online tool as a service to the community. The engine behind this website was LXR, a Perl project almost as old as the kernel itself. For the first few years, we used the then-current 0.9.5 version of LXR, but in early 2009 and for various reasons, we reverted to the older 0.3.1 version (from 1999!). In a nutshell, it was simpler and it scaled better.

Recently, we had the opportunity to spend some time on it, to correct a few bugs and to improve the service. After studying the Perl source code and trying out various cross-referencing engines (among which LXR 2.2 and OpenGrok), we decided to implement our own source code cross-referencing engine in Python.

Why create a new engine?

Our goal was to extend our existing service (support for multiple projects, responsive design, etc.) while keeping it simple and fast. When we tried other cross-referencing engines, we were dissatisfied with their relatively low performance on a large codebase such as Linux. Although we probably could have tweaked the underlying database engine for better performance, we decided it would be simpler to stick to the strategy used in LXR 0.3: get away from the relational database engine and keep plain lists in simple key-value stores.

Another reason that motivated a complete rewrite was that we wanted to provide an up-to-date reference (including the latest revisions) while keeping it immutable, so that external links to the source code wouldn’t get broken in the future. As a direct consequence, we would need to index many different revisions for each project, with potentially a lot of redundant information between them. That’s when we realized we could leverage the data model of Git to deal with this redundancy in an efficient manner, by indexing Git blobs, which are shared between revisions. In order to make sure queries under this strategy would be fast enough, we wrote a proof-of-concept in Python, and thus Elixir was born.
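The saving from indexing blobs rather than whole revisions can be illustrated with a toy model. The sketch below is not Elixir's code: the tag names, file contents and the `index_blob` helper are made up for illustration. Blob IDs are computed the way Git hashes blob objects (SHA-1 over a `blob <size>` header, a NUL byte, and the content).

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical project: two tags that differ in a single file.
revisions = {
    "v1.0": {"main.c": b"int main(void) { return 0; }\n",
             "util.c": b"int helper(void) { return 1; }\n"},
    "v1.1": {"main.c": b"int main(void) { return 0; }\n",
             "util.c": b"int helper(void) { return 2; }\n"},
}

indexed = {}     # blob ID -> parse result, shared by all revisions
tag_blobs = {}   # tag -> set of blob IDs that make up that revision


def index_blob(content: bytes) -> int:
    """Stand-in for the expensive parsing step; here it just counts tokens."""
    return len(content.split())


for tag, files in revisions.items():
    tag_blobs[tag] = set()
    for path, content in files.items():
        blob_id = git_blob_id(content)
        tag_blobs[tag].add(blob_id)
        if blob_id not in indexed:   # unchanged files are parsed only once
            indexed[blob_id] = index_blob(content)

total_files = sum(len(files) for files in revisions.values())
print(len(indexed), "blobs indexed for", total_files, "files")
```

The unchanged `main.c` has the same blob ID under both tags, so it is parsed once even though it belongs to two revisions.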

What service does it provide?

First, we tried to minimize disruption to our users by keeping the user interface close to that of our old cross-referencing service. The main improvements are:

  • We now support multiple projects. For now, we provide reference for Linux, Busybox and U-Boot.
  • Every tag in each project’s git repository is now automatically indexed.
  • The design has been modernized and now fits comfortably on smaller screens like tablets.
  • The URL scheme has been simplified and extended with support for multiple projects. An HTTP redirector has been set up for backward compatibility.

Elixir supports multiple projects

Among other smaller improvements, it is now possible to copy and paste code directly without line numbers getting in the way.

How does it work?

Elixir is made of two Python scripts: “update” and “query”. The first looks for new tags and new blobs inside a Git repository, parses them and appends the new references to identifiers to a record inside the database. The second uses the database and the Git repository to display annotated source code and identifier references.

The parsing itself is done with Ctags, which provides us with identifier definitions. In order to find the references to these identifiers, Elixir then simply checks each lexical token in the source file against the definition database, and if that word is defined, a new reference is added.
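A minimal sketch of that reference-finding step is shown below. It is an illustration rather than Elixir's implementation: the `definitions` set is hard-coded where the real tool would use Ctags output, and a simple regular expression stands in for proper lexing (it would also match words inside comments and strings).

```python
import re
from collections import defaultdict

# Assumed to come from a Ctags run; hard-coded here for illustration.
definitions = {"helper", "buffer_size"}

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def find_references(source: str, defined: set) -> dict:
    """Map each defined identifier to the line numbers where it appears."""
    refs = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for token in IDENTIFIER.findall(line):
            if token in defined:
                refs[token].append(lineno)
    return dict(refs)


source = """int main(void)
{
    int n = helper(buffer_size);
    return helper(n);
}
"""

references = find_references(source, definitions)
print(references)
```

Tokens such as `int` or `n` are ignored because they are not in the definition set, which is what keeps the reference lists limited to known identifiers.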

Like in LXR 0.3, the database structure is kept very simple so that queries don’t have much work to do at runtime, thus speeding them up. In particular, we store references to a particular identifier as a simple list, which can be loaded and parsed very fast. The main difference with LXR is that our list includes references from every blob in the project, so we need to restrict it first to only the blobs that are part of the current version. This is done at runtime, simply by computing the intersection of this list with the list of blobs inside the current version.
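The runtime filtering step can be sketched as follows. The blob names, paths and record layout are hypothetical; only the idea of intersecting the stored reference list with the set of blobs in the requested version is taken from the description above.

```python
# Hypothetical data: references to one identifier across every blob ever
# indexed, stored as a flat list of (blob_id, path, line) entries.
all_references = [
    ("blob-a", "drivers/foo.c", 12),
    ("blob-b", "drivers/foo.c", 14),   # newer version of the same file
    ("blob-c", "lib/bar.c", 7),
]

# Blobs that make up each tagged version (also hypothetical).
version_blobs = {
    "v1.0": {"blob-a", "blob-c"},
    "v1.1": {"blob-b", "blob-c"},
}


def references_in_version(references, blobs):
    """Keep only the references whose blob belongs to the requested version."""
    return [ref for ref in references if ref[0] in blobs]


for ref in references_in_version(all_references, version_blobs["v1.1"]):
    print(ref)
```

Because set membership tests are cheap, the filter costs one pass over the stored list per query, however many versions have been indexed.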

Finally, we kept the user interface code clearly segregated from the engine itself by making these two modules communicate through a Unix command-line interface. This means that you can run queries directly on the command-line without going through the web interface.

Elixir code example

What’s next?

Our current focus is on improving multi-project support. In particular, each project has its own quirky way of using Git tags, which needs to be handled individually.

At the user-interface level, we are evaluating the possibility of having auto-completion and/or fuzzy search of identifier names. Also, we are looking for a way to provide direct line-level access to references even in the case of very common identifiers.

On the performance front, we would like to cut the indexation time by switching to a new database back-end that provides efficient appending to large records. Also, we could make source code queries faster by precomputing the references, which would also allow us to eliminate identifier “bleeding” between versions (the case where an identifier shows up as “defined in 0 files” because it is only defined in another version).

If you think of other ways we could improve our service, don’t hesitate to drop us a feature request or a patch!

Bonus: why call it “Elixir”?

On the spur of the moment, it seemed like a nice pun on the name “LXR”. But in retrospect, we wish to apologize to the Elixir language team and the community at large for unnecessary namespace pollution.

by Mikael Bouillot at June 13, 2017 07:39 AM

June 09, 2017

Free Electrons

Beyond boot testing: custom tests with LAVA

Since April 2016, we have had our own automated testing infrastructure to validate the Linux kernel on a large number of hardware platforms. We use this infrastructure to contribute to the KernelCI project, which tests the Linux kernel every day. However, the tests being done by KernelCI are really basic: it’s mostly booting a basic Linux system and checking that it reaches a shell prompt.

However, LAVA, the software component at the core of this testing infrastructure, can do a lot more than just basic tests.

The need for custom tests

With some of our engineers being Linux maintainers and given all the platforms we need to maintain for our customers, being able to automatically test specific features beyond a simple boot test was a very interesting goal.

In addition, manually testing a kernel change on a large number of hardware platforms can be really tedious. Being able to quickly send test jobs that will use an image you built on your machine can be a great advantage when you have some new code in development that affects more than one board.

We identified two main use cases for custom tests:

  • Automatic tests to detect regression, as does KernelCI, but with more advanced tests, including platform specific tests.
  • Manual tests executed by engineers to validate that the changes they are developing do not break existing features, on all platforms.

Overall architecture

Several tools are needed to run custom tests:

  • The LAVA instance, which controls the hardware platforms to be tested. See our previous blog posts on our testing hardware infrastructure and software architecture
  • An appropriate root filesystem, that contains the various userspace programs needed to execute the tests (benchmarking tools, validation tools, etc.)
  • A test suite, which contains various scripts executing the tests
  • A custom test tool that glues together the different components

The custom test tool knows all the hardware platforms available and which tests and kernel configurations apply to which hardware platforms. It identifies the appropriate kernel image, Device Tree, root filesystem image and test suite and submits a job to LAVA for execution. LAVA will download the necessary artifacts and run the job on the appropriate device.

Building custom rootfs

When it comes to testing specific drivers, dedicated testing, validation or benchmarking tools are sometimes needed. For example, for storage device testing, bonnie++ can be used, while iperf is nice for networking testing. As the default root filesystem used by KernelCI is really minimalist, we need to build our own, one for each architecture we want to test.

Buildroot is a simple yet efficient tool to generate root filesystems, and it is also used by KernelCI to build their minimalist root filesystems. We chose to use it and made custom configuration files to match our needs.

We ended up with custom rootfs built for ARMv4, ARMv5, ARMv7, and ARMv8, that embed for now Bonnie++, iperf, ping (not the Busybox implementation) and other tiny tools that aren’t included in the default Buildroot configuration.

Our Buildroot fork that includes our custom configurations is available as the buildroot-ci Github project (branch ci).

The custom test tool

The custom test tool is the tool that binds the different elements of the overall architecture together.

One of the main features of the tool is to send jobs. Jobs are text files used by LAVA to know what to do with which device. As they are described in LAVA as YAML files (in the version 2 of the API), it is easy to use templates to generate them based on a single model. Some information is quite static such as the device tree name for a given board or the rootfs version to use, but other details change for every job such as the kernel to use or which test to run.
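The templating idea can be sketched with nothing but the Python standard library. This is an illustration, not the code of the custom test tool: the template keys are placeholders invented for this example and do not reproduce the LAVA v2 job schema, and the board data, rootfs version and URL are assumed values.

```python
from string import Template

# Hypothetical template: the keys below are placeholders chosen for this
# example and do not reproduce the actual LAVA v2 job schema.
JOB_TEMPLATE = Template("""\
job_name: $board-$test
device_type: $board
kernel_url: $kernel
dtb_name: $dtb
rootfs_version: $rootfs
test: $test
""")

# Static, per-board information (assumed values).
BOARDS = {
    "beaglebone-black": {"dtb": "am335x-boneblack.dtb", "rootfs": "armv7-2017.05"},
}


def render_job(board: str, kernel: str, test: str) -> str:
    """Fill the template with static board data plus per-job details."""
    return JOB_TEMPLATE.substitute(board=board, kernel=kernel, test=test,
                                   **BOARDS[board])


job = render_job("beaglebone-black", "http://example.org/zImage", "mmc")
print(job)
```

Keeping the static details in a per-board table means that only the kernel and the test name have to be supplied for each new job.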

We made the tool able to fetch the latest kernel images from KernelCI, so that jobs can be sent quickly without having to compile a custom kernel image. If the need is to test a custom image that is built locally, the tool is also able to send files to the LAVA server through SSH, to provide a custom kernel image.

The entry point of the tool is ctt.py, which allows creating new jobs and provides a lot of options to define the various aspects of the job (kernel, Device Tree, root filesystem, test, etc.).

This tool is written in Python, and lives in the custom_tests_tool Github project.

The test suite

The test suite is a set of shell scripts that perform tests returning 0 or 1 depending on the result. This test suite is included inside the root filesystem by LAVA as part of a preparation step for each job.
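The exit-status convention is easy to consume from a wrapper. The sketch below shows the general idea in Python and is not part of the actual test suite: the two "tests" are stand-in processes that exit with a fixed status, where the real suite would run shell scripts on the target.

```python
import subprocess
import sys


def run_test(name, command):
    """Run one test command and translate its exit status into pass/fail."""
    completed = subprocess.run(command)
    return name, "pass" if completed.returncode == 0 else "fail"


# Stand-ins for the shell scripts: each is a tiny Python process that exits
# with a fixed status.
tests = {
    "boot": [sys.executable, "-c", "raise SystemExit(0)"],
    "always-fails": [sys.executable, "-c", "raise SystemExit(1)"],
}

results = dict(run_test(name, cmd) for name, cmd in tests.items())
print(results)
```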

We currently have a small set of tests:

  • boot test, which simply returns 0. Such a test will be successful as soon as the boot succeeds.
  • mmc test, to test MMC storage devices
  • sata test, to test SATA storage devices
  • crypto test, to do some minimal testing of cryptographic engines
  • usb test, to test USB functionality using mass storage devices
  • simple network test, that just validates network connectivity using ping

All those tests only require the target hardware platform itself. However, for more elaborate network tests, we needed to get two devices to interact with each other: the target hardware platform and a reference PC platform. For this, we use the LAVA MultiNode API. It allows a test to span multiple devices, which we use to run multiple iperf sessions to benchmark the bandwidth. This test therefore has one part running on the target device (network-board) and one part running on the reference PC platform (network-laptop).

Our current test suite is available as the test_suite Github project. It is obviously limited to just a few tests for now, but we hope to extend it in the near future.

First use case: daily tests

As previously stated, it’s important for us to know about regressions introduced in the upstream kernel. Therefore, we have set up a simple daily cron job that:

  • Sends custom jobs to all boards to validate the latest mainline Linux kernel and latest linux-next
  • Aggregates results from the past 24 hours and sends emails to subscribed addresses
  • Updates a dashboard that displays results in a very simple page

A nice dashboard showing the tests of the Beaglebone Black and the Nitrogen6x.

Second use case: manual tests

The custom test tool ctt.py has a simple command line interface. It’s easy for someone to set it up and send custom jobs. For example:

ctt.py -b beaglebone-black -m network

will start the network test on the BeagleBone Black, using the latest mainline Linux kernel built by KernelCI. On the other hand:

ctt.py -b armada-7040-db armada-8040-db -t mmc --kernel arch/arm64/boot/Image --dtb-folder arch/arm64/boot/dts/

will run the mmc test on the Marvell Armada 7040 and Armada 8040 development boards, using the locally built kernel image and Device Tree.

The result of the job is sent over e-mail when the test has completed.

Conclusion

Thanks to this custom test tool, we now have an infrastructure that leverages our existing lab and LAVA instance to execute more advanced tests. Our goal is now to increase the coverage, by adding more tests, and run them on more devices. Of course, we welcome feedback and contributions!

by Florent Jacquet at June 09, 2017 02:25 PM

May 30, 2017

Bunnie Studios

Name that Ware May 2017

The Ware for May 2017 is shown below.

This is another one where the level of difficulty will depend on whether I cropped enough detail out of the photo to make it challenging but not impossible. If you do figure this one out quickly, I'm curious to hear which detail tipped you off!

by bunnie at May 30, 2017 08:04 AM

Winner, Name that Ware April 2017

The Ware for April 2017 is an HP 10780A optical receiver. Congrats to Brian for absolutely nailing this one! Email me for your prize.

by bunnie at May 30, 2017 08:04 AM

May 29, 2017

Open Hardware Repository

sfp-plus-i2c - SaFariPark now open for public

SaFariPark is not a site to book holidays on the African plains - though with additional personal funding I am willing to add that feature. It is a software tool to read and write the digital interface of SFP/SFP+ transceiver modules. Together with a device to plug in multiple (4) SFP/SFP+ modules, creatively called MultiSFP (see Figure 1), it is a versatile tool for all your SFP needs. MultiSFP and SaFariPark have been developed by Nikhef as part of the ASTERICS program, and all is open hardware/open source.


Figure 1 - MultiSFP front panel

MultiSFP supports a 10 Gigabit capable connection to the electrical interface of each SFP. Via one USB port, each SFP I2C bus can be exercised using SaFariPark. The software main window (Figure 2) exposes most of the functionality, including:
  • Editing of individual fields in the SFP module
  • Fixing corrupted SFP EEPROM data, recalculating checksums
  • Showing and saving SFP+ sensor data such as TX/RX power and temperature.
  • Selectively copying content of one SFP module to multiple other modules
  • Laser tuning of optical SFP+ modules


Figure 2 - Main window of SaFariPark

Apart from this, SaFariPark allows you to dump the entire EEPROM content, and extend the SFP+ EEPROM data dictionary with custom fields using XML. This enables users to add fields for custom or exotic SFP+ modules. As the software is written in Java, it has been verified to work on Linux and Windows. Mac has not been tested yet.
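As an illustration of what recalculating checksums involves, here is a minimal sketch. It is not taken from SaFariPark (which is written in Java) and assumes the SFF-8472 layout of the A0h page, where CC_BASE at byte 63 covers bytes 0-62 and CC_EXT at byte 95 covers bytes 64-94, each being the low-order 8 bits of the sum. The page contents are made up.

```python
def checksum(data: bytes, start: int, end: int) -> int:
    """Low-order 8 bits of the sum of data[start..end], inclusive."""
    return sum(data[start:end + 1]) & 0xFF


def fix_checksums(eeprom: bytes) -> bytes:
    """Return a copy of the A0h page with CC_BASE and CC_EXT recomputed."""
    if len(eeprom) < 96:
        raise ValueError("need at least the first 96 bytes of the A0h page")
    fixed = bytearray(eeprom)
    fixed[63] = checksum(fixed, 0, 62)    # CC_BASE covers bytes 0-62
    fixed[95] = checksum(fixed, 64, 94)   # CC_EXT covers bytes 64-94
    return bytes(fixed)


# Made-up page contents, only to exercise the function.
page = bytearray(range(96))
page[63] = 0x00   # deliberately wrong
page[95] = 0x00   # deliberately wrong
repaired = fix_checksums(bytes(page))
print(hex(repaired[63]), hex(repaired[95]))
```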

by Vincent van Beveren (v.van.beveren@nikhef.nl) at May 29, 2017 03:07 PM

May 28, 2017

Harald Welte

Playing back GSM RTP streams, RTP-HR bugs

Chapter 0: Problem Statement

In an all-IP GSM network, where we use Abis, A and other interfaces within the cellular network over IP transport, the audio of voice calls is transported inside RTP frames. The codec payload in those RTP frames is the actual codec frame of the respective cellular voice codec. In GSM, there are four relevant codecs: FR, HR, EFR and AMR.

Every so often during the (by now many years of) development of Osmocom cellular infrastructure software it would have been useful to be able to quickly play back the audio for analysis of given issues.

However, until now we didn't have that capability. The reason is relatively simple: In Osmocom, we generally don't do transcoding but simply pass the voice codec frames from left to right. They're only transcoded inside the phones or inside some external media gateway (in case of larger networks).

Chapter 1: GSM Audio Pocket Knife

Back in 2010, when we were very actively working on OsmocomBB, the telephone-side GSM protocol stack implementation, Sylvain Munaut wrote the GSM Audio Pocket Knife (gapk) in order to be able to convert between different formats (representations) of codec frames. In cellular communications, everyone is coming up with their own representation for the codec frames: The way they look on E1 as a TRAU frame is completely different from what the RTP payload looks like, or what the TI Calypso DSP uses internally, or what a GSM Tester like the Racal 61x3 uses. The differences are mostly about data types used, bit-endianness as well as padding and headers. And of course those different formats exist for each of the four codecs :/

In 2013 I first added simplistic RTP support for FR-GSM to gapk, which was sufficient for my debugging needs back then. Still, you had to save the decoded PCM output to a file and play that back, or use a pipe into aplay.

Last week, I picked up this subject again and added a long series of patches to gapk:

  • support for variable-length codec frames (required for AMR support)
  • support for AMR codec encode/decode using libopencore-amrnb
  • support of all known RTP payload formats for all four codecs
  • support for direct live playback to a sound card via ALSA

All of the above can now be combined to make GAPK bind to a specified UDP port and play back the RTP codec frames that anyone sends to that port using a command like this:

$ gapk -I 0.0.0.0/30000 -f rtp-amr -A default -g rawpcm-s16le
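To make the receiving side more concrete, here is a minimal sketch of what arrives on such a UDP port. It is not how gapk (which is written in C) is implemented; it only parses the fixed RTP header as defined in RFC 3550 and hands back the codec frame. The `listen` helper and the sample packet are illustrative assumptions.

```python
import socket
import struct


def parse_rtp(packet: bytes) -> dict:
    """Split an RTP packet into header fields and payload (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    offset = 12 + 4 * (b0 & 0x0F)            # skip the CSRC list, if any
    if b0 & 0x10:                            # header extension present
        ext_words = struct.unpack("!H", packet[offset + 2:offset + 4])[0]
        offset += 4 + 4 * ext_words
    end = len(packet)
    if b0 & 0x20:                            # last byte counts the padding
        end -= packet[-1]
    return {"payload_type": b1 & 0x7F, "marker": bool(b1 & 0x80),
            "sequence": seq, "timestamp": timestamp, "ssrc": ssrc,
            "payload": packet[offset:end]}


def listen(port: int = 30000) -> None:
    """Print a one-line summary of every RTP packet arriving on a UDP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        while True:
            data, _sender = sock.recvfrom(2048)
            rtp = parse_rtp(data)
            print(rtp["sequence"], rtp["payload_type"], len(rtp["payload"]))


# Self-check with a hand-built packet: payload type 3 is GSM full rate in the
# static RTP audio profile, and one FR frame is 33 bytes long.
sample = struct.pack("!BBHII", 0x80, 3, 1, 160, 0x1234) + bytes(33)
info = parse_rtp(sample)
print(info["payload_type"], info["sequence"], len(info["payload"]))
```

Calling `listen()` blocks forever and only reports on the packets; decoding the payload and playing it back is the part gapk takes care of.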

I've also merged a change to OsmoBSC/OsmoNITB which allows the administrator to re-direct the voice of any active voice channel towards a user-specified IP address and port. Using that you can simply disconnect the voice stream from its normal destination and play back the audio via your sound card.

Chapter 2: Bugs in OsmoBTS GSM-HR

While going through the exercise of implementing the above extension to gapk, I had lots of trouble getting it to work for GSM-HR.

After some more digging, it seems there are two conflicting specifications on how to format the RTP payload for half-rate GSM: RFC5993 and an ETSI specification.

In Osmocom, we claim to implement RFC5993, but it turned out that (at least) osmo-bts-sysmo (for sysmoBTS) was actually implementing the ETSI format instead.

And even worse, osmo-bts-sysmo gets even the ETSI format wrong. Each of the codec parameters (which are unaligned bit-fields) is in the wrong bit-endianness :(

Both of the above were coincidentally also discovered by Sylvain Munaut while operating the 32C3 GSM network in December 2015, and resulted in the following two "work around" patches:

  • HACK for HR
  • HACK: Fix the bit order in HR frames
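To show what this kind of mistake does to the bytes on the wire, here is a small sketch. The field values and widths are hypothetical and do not reflect the real GSM-HR parameter layout, and `pack_fields` is an illustrative helper rather than OsmoBTS code.

```python
def pack_fields(fields, reverse_field_bits=False):
    """Pack (value, width) bit-fields back to back, MSB first, into bytes."""
    bits = []
    for value, width in fields:
        field_bits = [(value >> (width - 1 - i)) & 1 for i in range(width)]
        if reverse_field_bits:
            field_bits.reverse()          # the kind of mistake described above
        bits.extend(field_bits)
    bits.extend([0] * (-len(bits) % 8))   # zero-pad to a whole number of bytes
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# Hypothetical parameters; the widths are not the real GSM-HR layout.
params = [(0b10110, 5), (0b001, 3), (0b1100101, 7), (0b1, 1)]

correct = pack_fields(params)
swapped = pack_fields(params, reverse_field_bits=True)
print(correct.hex(), swapped.hex())
```

Since the fields are unaligned, reversing the bits inside each field changes every byte of the frame, so a receiver that follows the specification decodes completely different parameter values.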

Those merely worked around those issues in the rtp_proxy of OsmoNITB, rather than addressing the real issue. That's ok, they were "quick" hacks to get something working at all during a four-day conference. I'm now working on "real" fixes in osmo-bts-sysmo. The devil is of course in the details, when people upgrade one BTS but not the other and want to inter-operate, ...

It yet remains to be investigated how osmo-bts-trx and other osmo-bts ports behave in this regard.

Chapter 3: Conclusions

Most definitely it is once again a very clear sign that more testing is required. It's tricky to see even with osmo-gsm-tester, as GSM-HR works between two phones or even two instances of osmo-bts-sysmo, since both sides of the implementation have the same (wrong) understanding of the spec.

Given that we can only catch this kind of bug together with the hardware (the DSP runs the PHY code), pure unit tests wouldn't catch it. And the end-to-end test is also not very well suited to it. It seems to call for something in between, something like an A-bis interface level test.

We need more (automatic) testing. I cannot say that often enough. The big challenge is how to convince contributors and customers that they should invest their time and money there, rather than yet-another (not automatically tested) feature?

by Harald Welte at May 28, 2017 10:00 PM

Mirko Vogt, nanl.de

SonOTA – Flashing Itead Sonoff devices via original OTA mechanism

Long story short

There’s now a script with which you can flash your Sonoff device via the original internal OTA upgrade mechanism, meaning there is no need to open the device, solder anything, etc. to get your custom firmware onto it.

This isn’t perfect (yet) — please mind the issues at the end of this post!

https://github.com/mirko/SonOTA

Credits

First things first: Credits!
The problem with credits is you usually forget somebody and that’s most likely happening here as well.
I read around quite a lot, gathered information and partially don’t even remember anymore where I read what (first).

Of course I’m impressed by the entire Tasmota project and what it enables one to do with the Itead Sonoff and similar devices.

Special thanks go to khcnz who helped me a lot in a discussion documented here.

I’d also like to mention Richard Burtons, with whom I didn’t interact directly; I only read his blog. That guy apparently was too bored by all the amazing tech stuff he was doing for a living, so he took a medical degree and is now working as a doctor, has a passion for horology (meaning, he’s building a turret clock), is sailing regattas with his own rs200, decompiles and reverse-engineers proprietary bootloaders in his spare time and writes a new bootloader called rboot for the ESP8266 as a side project.

EDIT: Jan Almeroth already reversed some of the protocol in 2016 and also documented the communication between the proprietary EWeLink app and the AWS cloud. Unfortunately I only became aware of that great post after I already finished mine.

Introduction Sonoff devices

Quite recently the Itead Sonoff series, a bunch of ESP8266-based IoT home automation devices, was brought to my attention.

The ESP8266 is a low-power consumption SoC especially designed for IoT purposes. It’s sold by Espressif, running a 32-Bit processor featuring the Xtensa instruction set (licensed from Tensilica) and having an ASIC IP core and WiFi onboard.

Those Sonoff devices using this SoC basically expect high voltage input and therefore contain an AC/DC (5V) converter, the ESP8266 SoC and a relay switching the high voltage output.
They’re sold as wall switches (“Sonoff Touch”), E27 socket adapters (“Slampher”), power sockets (“S20 smart socket”) or, as the most basic and cheapest model, all of that in a simple case (“Sonoff Basic”).
They also have a bunch of sensor devices, measuring temperature, power consumption, humidity, noise levels, fine dust, etc.

Though I’m rather sceptical about the whole IoT (development) philosophy, I always was (and still am) interested in low-cost and power-saving home automation which is completely and exclusively under my control.

That implies I’m obviously not interested in some random IoT devices being necessarily connected to some Google/Amazon/Whatever cloud, even less if sensitive data is transmitted without me knowing (but very well suspecting) what it’s used for.

Guess what the Itead Sonoff devices do? Exactly that! They even feature Amazon Alexa and Google Nest support! And of course you have to use their proprietary app to configure and control your devices, via the Amazon cloud.

However, as said earlier, they’re based on the ESP8266 SoC, around which a great deal of OpenSource projects evolved. For some reason especially the Arduino community pounced on that SoC, enabling a much broader range of people to play around with and program for those devices. Whether that’s a good and/or bad thing is surely debatable.

I’ll spare you the details about all the projects I ran into, there’s plenty of cool stuff out there.

I decided to go for the Sonoff-Tasmota project which is quite actively developed and supports most of the currently available Sonoff devices.

It provides an HTTP and MQTT interface and doesn’t need any connection to the internet at all. As MQTT server (in MQTT parlance called a broker) I use mosquitto, which I’m running on my OpenWrt WiFi router.

Flashing custom firmware (via serial)

Flashing your custom firmware onto those devices however always requires opening them, soldering a serial cable, pulling GPIO0 down to get the SoC into programming mode (which, depending on the device type, again involves soldering) and then flashing your firmware via serial.

Side note: Why do all those projects describing the flashing procedure name an “FTDI serial converter” as a requirement? Every serial TTL converter does the job.
And apart from the fact that FTDI is not a product but a company, it’s a pretty shady one. I’d just like to remind you of the “incident” where FTDI released new drivers for their chips which intentionally bricked clones of their converters.

As for how to manually flash via serial (even though firmware replacement via OTA (kinda) works now, you still might want to unbrick or debug your device): the Tasmota wiki provides instructions for each of the supported devices.

Anyway, as I didn’t want to open and solder every device I intend to use, I took a closer look at the original firmware and its OTA update mechanism.

Protocol analysis

The first thing the device does after being configured (meaning it got configured by the proprietary app and thereby has internet access via your local WiFi network) is to resolve the hostname `eu-disp.coolkit.cc` and attempt to establish an HTTPS connection.

Though the connection is SSL, the device doesn’t do any server certificate verification, so splitting the SSL connection and man-in-the-middling it is fairly easy.

As a side effect I ported the mitm project sslsplit to OpenWrt and created a separate “interception” network on my WiFi router. Now I only need to join that WiFi network and all SSL connections get split, with their payload logged and provided on an FTP share. Intercepting SSL connections never felt easier.

Back to the protocol: We’re assuming at this point the Sonoff device was already configured (e.g. by the official EWeLink app), which means it has joined our WiFi network, acquired IP settings via DHCP and has access to the internet.

The Sonoff device sends a dispatch call as HTTPS POST request to eu-disp.coolkit.cc including some JSON encoded data about itself:


POST /dispatch/device HTTP/1.1
Host: eu-disp.coolkit.cc
Content-Type: application/json
Content-Length: 152

{
  "accept":     "ws;2",
  "version":    2,
  "ts":         119,
  "deviceid":   "100006XXXX",
  "apikey":     "6083157d-3471-4f4c-8308-XXXXXXXXXXXX",
  "model":      "ITA-GZ1-GL",
  "romVersion": "1.5.5"
}

It expects a host, likewise JSON encoded, as an answer

HTTP/1.1 200 OK
Server: openresty
Date: Mon, 15 May 2017 01:26:00 GMT
Content-Type: application/json
Content-Length: 55
Connection: keep-alive

{
  "error":  0,
  "reason": "ok",
  "IP":     "52.29.48.55",
  "port":   443
}

which is used to establish a WebSocket connection

GET /api/ws HTTP/1.1
Host: iotgo.iteadstudio.com
Connection: upgrade
Upgrade: websocket
Sec-WebSocket-Key: ITEADTmobiM0x1DaXXXXXX==
Sec-WebSocket-Version: 13


HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: q1/L5gx6qdQ7y3UWgO/TXXXXXXA=

which subsequently will be used for further interchange.
Payload via the established WebSocket channel continues to be encoded in JSON.
The messages coming from the device can be classified into action-requests initiated by the device (which expect acknowledgements from the server) and acknowledgement messages for requests initiated by the server.
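
A minimal sketch of that classification, assuming (judging only by the captured messages in this post) that requests initiated by the device carry an "action" field while its acknowledgements carry "error" but no "action". The function name is made up for illustration and is not taken from the SonOTA script:

```python
import json


def classify_device_message(raw):
    """Classify one JSON text frame received from the device."""
    msg = json.loads(raw)
    # Assumption from the captures: device-initiated requests have "action".
    if "action" in msg:
        return "action-request"
    # Assumption from the captures: acknowledgements have "error" and no "action".
    if "error" in msg:
        return "acknowledgement"
    return "unknown"
```

A server loop could use the result to decide whether it has to send an answer (action-request) or merely match the frame against a request it sent earlier (acknowledgement).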

The first requests are action-requests coming from the device:

1) action: register

{
  "userAgent":  "device",
  "apikey":     "6083157d-3471-4f4c-8308-XXXXXXXXXXXX",
  "deviceid":   "100006XXXX",
  "action":     "register",
  "version":    2,
  "romVersion": "1.5.5",
  "model":      "ITA-GZ1-GL",
  "ts":         712
}

responded by the server with
{
  "error":       0,
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "config": {
    "hb":         1,
    "hbInterval": 145
  }
}

As can be seen, action-requests initiated from server side also have an apikey field, which can be any generated UUID other than the one used by the device, as long as it’s used consistently in that WebSocket session.

2) action: date

{
  "userAgent":  "device",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "deviceid":   "100006XXXX",
  "action":     "date"
}

responded with
{
  "error":      0,
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "date":       "2017-05-15T01:26:01.498Z"
}

Pay attention to the date format: it is some kind of ISO 8601, but the parser is really picky about it. While Python’s datetime.isoformat() function, for example, returns a string taking microseconds into account, the parser on the device will just fail parsing that. It also always expects the (actually optional) timezone to be specified as UTC, and only as a trailing Z (though according to the spec “+00:00” would be valid as well).
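
One way to produce a string in exactly the shape seen in the capture (millisecond precision, trailing Z) is to format it by hand instead of relying on isoformat(). This is only a sketch; the helper name is made up for illustration, and whether the device accepts other variants was not tested here:

```python
from datetime import datetime, timezone


def device_date(now=None):
    """Format a timestamp like 2017-05-15T01:26:01.498Z (UTC, milliseconds)."""
    if now is None:
        now = datetime.now(timezone.utc)
    if now.tzinfo is None:
        # Refuse naive datetimes instead of guessing their timezone.
        raise ValueError("expected a timezone-aware datetime")
    now = now.astimezone(timezone.utc)
    # Truncate microseconds to milliseconds and append the literal Z.
    return now.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % (now.microsecond // 1000)
```

For comparison, calling isoformat() on the same UTC-aware value would yield six fractional digits and a "+00:00" suffix, which is the form the device parser reportedly rejects.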

3) action: update (the device tells the server its switch status, the MAC address of the access point it is connected to, signal quality, etc.)
This message also appears every time the device status changes, e.g. when it got switched on/off via the app or locally by pressing the button.

{
  "userAgent":      "device",
  "apikey":         "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "deviceid":       "100006XXXX",
  "action":         "update",
  "params": {
    "switch":         "off",
    "fwVersion":      "1.5.5",
    "rssi":           -41,
    "staMac":         "5C:CF:7F:F5:19:F8",
    "startup":        "off"
  }
}

simply acknowledged with
{
  "error":      0,
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX"
}

4) action: query — the device queries potentially configured timers
{
  "userAgent":  "device",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "deviceid":   "100006XXXX",
  "action":     "query",
  "params": [
    "timers"
  ]
}

as there are no timers configured the answer simply contains a "params":0 KV-pair
{
  "error":      0,
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "params":     0
}

That’s it – that’s the basic handshake after the (configured) device powers up.
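
The server side of this handshake can be summarised in a few lines. The following is a sketch only: it reproduces the answers captured above (including the hb/hbInterval values and "params": 0 for "no timers"), the function name is made up for illustration, and it is not the code of the SonOTA script:

```python
import json
from datetime import datetime, timezone


def handshake_reply(msg, server_apikey, now=None):
    """Build the server's JSON answer to a device action-request."""
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    reply = {"error": 0, "deviceid": msg["deviceid"], "apikey": server_apikey}
    action = msg.get("action")
    if action == "register":
        # Values copied from the captured answer; their meaning is not verified.
        reply["config"] = {"hb": 1, "hbInterval": 145}
    elif action == "date":
        # Millisecond precision and a trailing Z, as in the captured answer.
        reply["date"] = now.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % (now.microsecond // 1000)
    elif action == "query":
        # Captured answer when no timers are configured.
        reply["params"] = 0
    elif action != "update":
        raise ValueError("unexpected action: %r" % (action,))
    return json.dumps(reply)
```

The server_apikey argument is expected to be a UUID generated once per WebSocket session (for example with uuid.uuid4()) and reused for every answer in that session.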

Now the server can tell the device to do stuff.

The sequence number is used by the device to acknowledge particular action-requests so the response can be mapped back to the actual request. It appears to be a UNIX timestamp with millisecond precision which doesn’t seem like the best source for generating a sequence number (duplicates, etc.) but seems to work well enough.

Let’s switch the relay:

{
  "action":     "update",
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "userAgent":  "app",
  "sequence":   "1494806715179",
  "ts":         0,
  "params": {
    "switch":     "on"
  },
  "from":       "app"
}

{
  "action":     "update",
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "userAgent":  "app",
  "sequence":   "1494806715193",
  "ts":         0,
  "params": {
    "switch":     "off"
  },
  "from":       "app"
}

As mentioned earlier, each action-request is answered with a proper acknowledgement.

And, finally, what the server is now also capable of doing is telling the device to update itself:
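
As a sketch, such a switch request and the matching of its acknowledgement could look as follows. The field values mirror the captured messages; the function names are made up for illustration, and using the current time in milliseconds as sequence number simply imitates what the capture appears to show:

```python
import time


def make_switch_request(deviceid, apikey, on, now_ms=None):
    """Build an "update" action-request that switches the relay on or off."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return {
        "action": "update",
        "deviceid": deviceid,
        "apikey": apikey,
        "userAgent": "app",
        "sequence": str(now_ms),  # sent as a string, as in the capture
        "ts": 0,
        "params": {"switch": "on" if on else "off"},
        "from": "app",
    }


def is_ack_for(request, reply):
    """True if reply is a successful acknowledgement of request."""
    return reply.get("error") == 0 and reply.get("sequence") == request["sequence"]
```

Since two requests built within the same millisecond would share a sequence number, a careful implementation might prefer a simple counter; that is a design choice this sketch does not make.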

{
  "action":     "upgrade",
  "deviceid":   "100006XXXX",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "userAgent":  "app",
  "sequence":   "1494802194654",
  "ts":         0,
  "params": {
    "binList":[
      {
        "downloadUrl":  "http://52.28.103.75:8088/ota/rom/xpiAOwgVUJaRMqFkRBsoI4AVtnozgwp1/user1.1024.new.2.bin",
        "digest":       "1aee969af1daf96f3f120323cd2c167ae1aceefc23052bb0cce790afc18fc634",
        "name":         "user1.bin"
      },
      {
        "downloadUrl":  "http://52.28.103.75:8088/ota/rom/xpiAOwgVUJaRMqFkRBsoI4AVtnozgwp1/user2.1024.new.2.bin",
        "digest":       "6c4e02d5d5e4f74d501de9029c8fa9a7850403eb89e3d8f2ba90386358c59d47",
        "name":         "user2.bin"
      }
    ],
    "model":    "ITA-GZ1-GL",
    "version":  "1.5.5"
  }
}

After successful download and verification of the image’s checksum the device returns:
{
  "error":      0,
  "userAgent":  "device",
  "apikey":     "85036160-aa4a-41f7-85cc-XXXXXXXXXXXX",
  "deviceid":   "100006XXXX",
  "sequence":   "1495932900713"
}

The downloadUrl field should be self-explanatory (the subsequent HTTP GET requests to those URLs contain some more data as CGI parameters, which however can be omitted).

The digest is a SHA-256 hash of the file and the name is the partition onto which the file should be written.
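
Computing such a digest for your own image is a one-liner with Python’s hashlib. The sketch below builds one binList entry; the function names and the example URL are made up for illustration:

```python
import hashlib


def sha256_hex(image):
    """Return the SHA-256 digest of the image bytes as lowercase hex."""
    return hashlib.sha256(image).hexdigest()


def bin_list_entry(name, download_url, image):
    """Build one element of the "binList" array of an upgrade request."""
    return {
        "downloadUrl": download_url,
        "digest": sha256_hex(image),
        "name": name,
    }
```

For example, bin_list_entry("user1.bin", "http://192.0.2.1:8080/ota/image_user1.bin", data) yields a dictionary in the same shape as the entries shown above, with data being the bytes of the image file.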

Implementing server side

After some early approaches I decided to go for a Python implementation using the tornado webserver stack.
This decision was mainly based on it providing functionality for HTTP (obviously) as well as websockets and asynchronous handling of requests.

The final script can be found here: https://github.com/mirko/SonOTA

==> Trial & Error

1st attempt

As user1.1024.new.2.bin and user2.1024.new.2.bin almost look the same, let’s just use the same image for both, in this case a Tasmota build.

MOEP! Boot fails.

Reason: The Tasmota build also contains the bootloader, which the Espressif OTA mechanism doesn’t expect to be in the image.

2nd attempt

Chopping off the first 0x1000 bytes which contain the bootloader plus padding (filled up with 0xAA bytes).

MOEP! Boot fails.

Boot mode 1 and 2 / v1 and v2 image headers

The (now chopped) image and the original upgrade images appear to have different headers: even the very first byte (the files’ magic byte) differs.

The original image starts with 0xEA while the Tasmota build starts with 0xE9.

Apparently there are two image formats (called v1 and v2 or boot mode 1 and boot mode 2).
The former (older) one — used by Arduino/Tasmota — starts with 0xE9, while the latter (and apparently newer one) — used by the original firmware — starts with 0xEA.

The technical differences are very well documented by the ESP8266 Reverse Engineering Wiki project; regarding the flash format and the v1/v2 headers, see in particular the SPI Flash Format wiki page.

The original bootloader only accepts images starting with 0xEA while the bootloader provided by Arduino/Tasmota only accepts such starting with 0xE9.
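
The two checks described so far (which header format an image has, and chopping off the bootloader area) can be sketched like this. The magic bytes and the 0x1000 size are taken from the text above; the function names are made up for illustration:

```python
V1_MAGIC = 0xE9          # older format, used by Arduino/Tasmota builds
V2_MAGIC = 0xEA          # newer format, used by the original firmware
BOOTLOADER_SIZE = 0x1000  # bootloader plus padding at the start of a Tasmota build


def image_format(image):
    """Return "v1" or "v2" depending on the first (magic) byte of the image."""
    if not image:
        raise ValueError("empty image")
    if image[0] == V1_MAGIC:
        return "v1"
    if image[0] == V2_MAGIC:
        return "v2"
    raise ValueError("unknown magic byte 0x%02X" % image[0])


def strip_bootloader(image):
    """Drop the first 0x1000 bytes (bootloader and padding)."""
    return image[BOOTLOADER_SIZE:]
```

Note that stripping the bootloader does not change the format: a chopped Tasmota build still reports "v1" here, which is exactly why the original bootloader rejects it.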

3rd attempt

Converting Arduino images to v2 images

Easier said than done, as the Arduino framework doesn’t seem to be capable of creating v2 images and none of the common tools appear to have conversion functionality.

Taking a closer look at the esptool.py project however, there seems to be (undocumented) functionality.
esptool.py has the elf2image argument which, according to the source, allows switching between conversion to v1 and v2 images.

When using elf2image and also passing the --version parameter (which normally prints out the version string of the tool), the --version parameter gets redefined and then expects an argument: 1 or 2.

Besides the sonoff.ino.bin file the Tasmota project also creates a sonoff.ino.elf which can now be used in conjunction with esptool.py and the elf2image parameter to create v2 images.

Example: esptool.py elf2image --version 2 tmp/arduino_build_XXXXXX/sonoff.ino.elf

WORKS! MOEP! WORKS! MOEP!

Remember the upgrade-action passed a 2-element list of download URLs to the device, having different names (user1.bin and user2.bin)?

This procedure now only works if the user1.bin image is being fetched and flashed.

Differences between user1.bin and user2.bin

The flash on the Sonoff devices is split into 2 parts (simplified!) which basically contain the same data (user1 and user2). As OTA upgrades are proven to fail sometimes for whatever reason, the upgrade will always happen on the currently inactive part, meaning, if the device is currently running the code from the user1 part, the upgrade will happen onto the user2 part.
That mechanism was not invented by Itead, but is actually provided as an off-the-shelf OTA solution by Espressif (the SoC manufacturer) itself.

For 1MB flash chips the user1 image is stored at offset 0x01000 while the user2 image is stored at 0x81000.

And indeed, the two original upgrade images (user1 and user2) differ significantly.

If a user2 image is flashed onto the user1 part of the flash, the device refuses to boot, and vice versa.

While there’s not much information about how user1.bin and user2.bin technically differ from each other, khcnz pointed me to an Espressif document stating:

user1.bin and user2.bin are [the] same software placed to different regions of [the] flash. The only difference is [the] address mapping on flash.

4th attempt

So apparently those 2 images must be created differently indeed.

Again it was khcnz who pointed me to different linker scripts used for each image within the original SDK.
Diffing
https://github.com/espressif/ESP8266_RTOS_SDK/blob/master/ld/eagle.app.v6.new.1024.app1.ld
and
https://github.com/espressif/ESP8266_RTOS_SDK/blob/master/ld/eagle.app.v6.new.1024.app2.ld
reveals that the irom0_0_seg differs (org = 0x40201010 vs. org = 0x40281010).

As Tasmota doesn’t make use of the user1-/user2-ping-pong mechanism it only creates images supposed to go to 0x1000 (=user1-partition).

So for creating a user2.bin image (in our case for a device having a 1MB flash chip and allocating only 64K for SPIFFS) we have to modify the following linker script accordingly:

--- a/~/.arduino15/packages/esp8266/hardware/esp8266/2.3.0/tools/sdk/ld/eagle.flash.1m64.ld
+++ b/~/.arduino15/packages/esp8266/hardware/esp8266/2.3.0/tools/sdk/ld/eagle.flash.1m64.ld
@@ -7,7 +7,7 @@ MEMORY
   dport0_0_seg :                        org = 0x3FF00000, len = 0x10
   dram0_0_seg :                         org = 0x3FFE8000, len = 0x14000
   iram1_0_seg :                         org = 0x40100000, len = 0x8000
-  irom0_0_seg :                         org = 0x40201010, len = 0xf9ff0
+  irom0_0_seg :                         org = 0x40281010, len = 0xf9ff0
 }
 
 PROVIDE ( _SPIFFS_start = 0x402FB000 );
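
The two org values are not arbitrary. As a sketch of the arithmetic, assuming the SPI flash is memory-mapped starting at 0x40200000 and that the extra 0x10 bytes visible in the linker scripts account for the image header (both are my reading of the numbers, not something stated in the Espressif document quoted above):

```python
FLASH_MAP_BASE = 0x40200000  # assumption: start of the memory-mapped SPI flash
IROM_SKIP = 0x10             # assumption: gap seen in the linker scripts

USER1_OFFSET = 0x01000       # flash offset of user1 on 1MB chips (from the text)
USER2_OFFSET = 0x81000       # flash offset of user2 on 1MB chips (from the text)


def irom_org(flash_offset):
    """Address at which irom0_0_seg starts for an image stored at flash_offset."""
    return FLASH_MAP_BASE + flash_offset + IROM_SKIP


print(hex(irom_org(USER1_OFFSET)))  # 0x40201010
print(hex(irom_org(USER2_OFFSET)))  # 0x40281010
```

In other words, the only change needed for user2 is to shift the origin by the 0x80000 bytes that separate the two partitions.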

So we will now create a user1 image (without the above modification applied) and a user2 image (with the above modification) and convert them to v2 images with esptool.py as described above.

–> WORKS!

Depending on whether the original firmware was loaded from the user1 or user2 partition, it will fetch and flash the other image, telling the bootloader afterwards to change the active partition.

Issues

Mission accomplished? Not just yet…

Although our custom firmware is now flashed via the original OTA mechanism and running, the final setup differs in 2 major aspects compared to having flashed the device via serial:

  • The bootloader is still the original one
  • Our custom image might have ended up in the user2 partition

Each point alone already results in the Tasmota/Arduino OTA mechanism not working.
Additionally, since the bootloader stays the original one, it still only expects v2 images and still messes with us with its ping-pong mechanism.

This issue is already being addressed though and discussed on how to be solved best in the issue ticket mentioned at the very beginning.

Happy hacking!

by mirko at May 28, 2017 08:05 PM

May 26, 2017

Open Hardware Repository

Hdlmake - HDLMake version 3.0 promoted to Master

HDLMake 3.0

After a massive refactoring & upgrade process, we have finally published the brand-new HDLMake 3.0 version. This version not only sports a whole set of new features, but has been carefully crafted so that the source code providing a common interface for the growing set of supported tools can be easily maintained.

New Features

These are some of the highlighted features for the new HDLMake v3.0 Release:

  • Updated HDL code parser and solver: the new release includes by default the usage of an embedded HDL code parser and file dependency solver to manage the synthesis and simulation process in an optimal way.
  • Support for Python 3.x: the new release supports both Python2.7 and Python3.x deployments in a single source code branch, enabling an easier integration into newer O.S. distributions.
  • Native support for Linux & Windows shells: The new release not only supports Linux shells as the previous ones did, but also features native support for Windows shells such as the classic CMD prompt or the new PowerShell.
  • TCL based Makefiles: in order to streamline the process of supporting as many tools as possible in a hierarchical way, in a changing and rapidly evolving world of FPGA technology and tool providers, we have adopted TCL as the common language layer used by the generated synthesis Makefiles.
  • Proper packaging: from HDLMake 3.0 onwards, the source code is distributed as a Python package, which allows for a much cleaner installation procedure.

More info

You can find more info about the HDLMake 3.0 version in the following links:

by Javier D. Garcia-Lasheras (jgarcia@gl-research.com) at May 26, 2017 01:35 PM

May 23, 2017

Harald Welte

Power-cycling a USB port should be simple, right?

Every so often I happen to be involved in designing electronics equipment that's supposed to run reliably in remote, inaccessible locations, without any ability for "remote hands" to perform things like power-cycling or the like. I'm talking about really remote locations, possibly with no or only limited back-haul, and a very high cost of ever sending somebody there for maintenance.

Given that a lot of computer peripherals (chips, modules, ...) use USB these days, this is often some kind of an embedded ARM (rarely x86) SoM or SBC, which is hooked up to a custom board that contains a USB hub chip as well as a line of peripherals.

One of the most important lessons I've learned from experience is: Never trust reset signals / lines, always include power-switching capability. There are many chips and electronics modules available on the market that either have no RESET, or claim to have a hardware RESET line which you later (painfully) discover to be just a GPIO polled by software which can get stuck, leaving no way to really hard-reset the given component.

In the case of a USB-attached device (even though the USB might only exist on a circuit board between two ICs), this is typically rather easy: The USB hub is generally capable of switching the power of its downstream ports. Many cheap USB hubs don't implement this at all, or implement only ganged switching, but if you carefully select your USB hub (or in the case of a custom PCB), you can make sure that the given USB hub supports individual port power switching.

Now the next step is how to actually use this from your (embedded) Linux system. It turns out to be harder than expected. After all, we're talking about a standard feature that has been present in the USB specifications since USB 1.x in the late 1990s. So the expectation is that it should be straight-forward to do with any decent operating system.

I don't know how it is on other operating systems, but on Linux I couldn't really find a clean way to do this. For more details, please read my post to the linux-usb mailing list.

Why am I running into this now? Is it such a strange idea? I mean, power-cycling a device should be the most simple and straight-forward thing to do in order to recover from any kind of "stuck state" or other related issue. Logical enabling/disabling of the port, resetting the USB device via USB protocol, etc. are all just "soft" forms of a reset which at best help with USB related issues, but not with any other part of a USB device.

And in the case of e.g. a USB-attached cellular modem, we're actually talking about a multi-processor system with multiple built-in micro-controllers, at least one DSP, an ARM core that might run another Linux itself (to implement the USB gadget), ... - certainly enough complex software that you would want to be able to power-cycle it...

I'm curious what the response of the Linux USB gurus is.

by Harald Welte at May 23, 2017 10:00 PM

Open Hardware Repository

Yet Another Micro-controller - YAM first release at OHR

YAM release V1.4 is now available from CERN OHR.

The core has already been used in a number of designs at ESRF.

However there is still some pending work.
  • The co-processors have not been fully tested yet.
  • The yamasm assembler doesn't yet support the 3-operand implementation.

by Christian Herve at May 23, 2017 08:41 AM

May 22, 2017

Free Electrons

Introducing lavabo, board remote control software

In two previous blog posts, we presented the hardware and software architecture of the automated testing platform we have created to test the Linux kernel on a large number of embedded platforms.

The primary use case for this infrastructure was to participate in the KernelCI.org testing effort, which tests the Linux kernel every day on many hardware platforms.

However, since our embedded boards are now fully controlled by LAVA, we wondered if we could not only use our lab for KernelCI.org, but also provide remote control of our boards to Free Electrons engineers so that they can access development boards from anywhere. lavabo was born from this idea and its goal is to allow full remote control of the boards as it is done in LAVA: interface with the serial port, control the power supply and provide files to the board using TFTP.

The advantages of being able to access the boards remotely are obvious: allowing engineers working from home to work on their hardware platforms, avoid moving the boards out of the lab and back into the lab each time an engineer wants to do a test, etc.

User’s perspective

From a user’s point of view, lavabo is used through the eponymous command lavabo, which allows users to:

  • List the boards and their status
    $ lavabo list
  • Reserve a board for lavabo usage, so that it is no longer used for CI jobs
    $ lavabo reserve am335x-boneblack_01
  • Upload a kernel image and Device Tree blob so that it can be accessed by the board through TFTP
    $ lavabo upload zImage am335x-boneblack.dtb
  • Connect to the serial port of the board
    $ lavabo serial am335x-boneblack_01
  • Reset the power of the board
    $ lavabo reset am335x-boneblack_01
  • Power off the board
    $ lavabo power-off am335x-boneblack_01
  • Release the board, so that it can once again be used for CI jobs
    $ lavabo release am335x-boneblack_01

Overall architecture and implementation

The following diagram summarizes the overall architecture of lavabo (components in green) and how it connects with existing components of the LAVA architecture.

lavabo reuses LAVA tools and configuration files

A client-server software

lavabo follows the classical client-server model: the lavabo client is installed on the machines of users, while the lavabo server is hosted on the same machine as LAVA. The server-side of lavabo is responsible for calling the right tools directly on the server machine and making the right calls to LAVA’s API. It controls the boards and interacts with the LAVA instance to reserve and release a board.

On the server machine, a specific Unix user is configured, through its .ssh/authorized_keys, to automatically spawn the lavabo server program when someone connects. The lavabo client and server interact directly using their stdin/stdout, by exchanging JSON dictionaries. This interaction model was inspired by the Attic backup program. Therefore, the lavabo server is not a background process that runs permanently like traditional daemons.
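
To illustrate the interaction model, here is a minimal sketch of a server that reads JSON dictionaries from stdin and answers on stdout. It is not lavabo's actual code: the one-dictionary-per-line framing, the "command", "status" and "content" keys and the "whoami" command are all hypothetical and only serve the example.

```python
import io
import json


def handle_request(request, user):
    """Hypothetical dispatcher; a real server would call LAVA tools here."""
    if request.get("command") == "whoami":
        return {"status": "success", "content": user}
    return {"status": "error", "content": "unknown command"}


def serve(stdin, stdout, user):
    """Answer every JSON dictionary read from stdin with one on stdout."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
        except ValueError:
            reply = {"status": "error", "content": "invalid JSON"}
        else:
            reply = handle_request(request, user)
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()


# In-memory demonstration; a real entry point would pass sys.stdin and sys.stdout.
demo_out = io.StringIO()
serve(io.StringIO('{"command": "whoami"}\n'), demo_out, "alice")
print(demo_out.getvalue(), end="")
```

With OpenSSH, a forced command in .ssh/authorized_keys (the command="..." option placed in front of a key) is the usual way to have such a program started on connection, so that the process only lives as long as the SSH session.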

Handling serial connection

Exchanging JSON over SSH works fine to allow the lavabo client to provide instructions to the lavabo server, but it doesn’t work well to provide access to the serial ports of the boards. However, ser2net is already used by LAVA and provides a local telnet port for each serial port. lavabo simply uses SSH port-forwarding to redirect those telnet ports to local ports on the user’s machine.

Different ways to connect to the serial

Interaction with LAVA

To use a board outside of LAVA, we have to interact with LAVA to tell it the board cannot be used anymore. We therefore had to work with LAVA developers to add endpoints for putting boards online (release) and offline (reserve), and an endpoint to get the current status of a board (busy, idle or offline) in LAVA’s API.

These additions to the LAVA API are used by the lavabo server to reserve and release boards, so that there is no conflict between the CI related jobs (such as the ones submitted by KernelCI.org) and the direct use of boards for remote development.

Interaction with the boards

Now that we know how the client and the server interact and also how the server communicates with LAVA, we need a way to know which boards are in the lab, on which port the serial connection of a board is exposed and what are the commands to control the board’s power supply. All this configuration has already been given to LAVA, so lavabo server simply reads the LAVA configuration files.

The last requirement is to provide files to the board, such as kernel images, Device Tree blobs, etc. Indeed, from a network point of view, the boards are located in a different subnet not routed directly to the users’ machines. LAVA already has a directory accessible through TFTP from the boards, which is one of the mechanisms used to serve files to boards. Therefore, the easiest and most obvious way is to send files from the client to the server and move the files to this directory, which we implemented using SFTP.

User authentication

Since the serial port cannot be shared among several sessions, it is essential to guarantee a board can only be used by one engineer at a time. In order to identify users, we have one SSH key per user in the .ssh/authorized_keys file on the server, each associated to a call to the lavabo-server program with a different username.

This allows us to identify who is reserving/releasing the boards, and make sure that serial port access, or requests to power off or reset the boards are done by the user having reserved the board.

For TFTP, the lavabo upload command automatically uploads files into a per-user sub-directory of the TFTP server. Therefore, when a file called zImage is uploaded, the board will access it over TFTP by downloading user/zImage.

Availability and installation

As you could guess from our love for FOSS, lavabo is released under the GNU GPLv2 license in a GitHub repository. Extensive documentation is available if you’re interested in installing lavabo. Of course, patches are welcome!

by Quentin Schulz at May 22, 2017 07:25 AM

May 09, 2017

Free Electrons

Eight channels audio on i.MX7 with PCM3168

Free Electrons engineer Alexandre Belloni recently worked on a custom carrier board for a Colibri iMX7 system-on-module from Toradex. This system-on-module obviously uses the i.MX7 ARM processor from Freescale/NXP.

While the module includes an SGTL5000 codec, one of the requirements for that project was to handle up to eight audio channels. The SGTL5000 uses I²S and handles only two channels.

I2S

I2S timing diagram from the SGTL5000 datasheet

Thankfully, the i.MX7 has multiple audio interfaces and one is fully available on the SODIMM connector of the Colibri iMX7. A TI PCM3168 was chosen for the carrier board and is connected to the second Synchronous Audio Interface (SAI2) of the i.MX7. This codec can handle up to 8 output channels and 6 input channels. It supports multiple formats, but TDM requires the fewest signals (4 signals: bit clock, word clock, data input and data output).


TDM timing diagram from the PCM3168 datasheet

The current Linux long term support version is 4.9 and was chosen for this project. It has support for both the i.MX7 SAI (sound/soc/fsl/fsl_sai.c) and the PCM3168 (sound/soc/codecs/pcm3168a.c). That’s two of the three components that are needed, the last one being the driver linking both by describing the topology of the “sound card”. To keep custom code to a minimum, the existing generic driver called simple-card (sound/soc/generic/simple-card.c) can be used. It is always worth trying to use it unless something really specific prevents that. Using it was as simple as writing the following DT node:

        board_sound {
                compatible = "simple-audio-card";
                simple-audio-card,name = "imx7-pcm3168";
                simple-audio-card,widgets =
                        "Speaker", "Channel1out",
                        "Speaker", "Channel2out",
                        "Speaker", "Channel3out",
                        "Speaker", "Channel4out",
                        "Microphone", "Channel1in",
                        "Microphone", "Channel2in",
                        "Microphone", "Channel3in",
                        "Microphone", "Channel4in";
                simple-audio-card,routing =
                        "Channel1out", "AOUT1L",
                        "Channel2out", "AOUT1R",
                        "Channel3out", "AOUT2L",
                        "Channel4out", "AOUT2R",
                        "Channel1in", "AIN1L",
                        "Channel2in", "AIN1R",
                        "Channel3in", "AIN2L",
                        "Channel4in", "AIN2R";

                simple-audio-card,dai-link@0 {
                        format = "left_j";
                        bitclock-master = <&pcm3168_dac>;
                        frame-master = <&pcm3168_dac>;
                        frame-inversion;

                        cpu {
                                sound-dai = <&sai2>;
                                dai-tdm-slot-num = <8>;
                                dai-tdm-slot-width = <32>;
                        };

                        pcm3168_dac: codec {
                                sound-dai = <&pcm3168 0>;
                                clocks = <&codec_osc>;
                        };
                };

                simple-audio-card,dai-link@2 {
                        format = "left_j";
                        bitclock-master = <&pcm3168_adc>;
                        frame-master = <&pcm3168_adc>;

                        cpu {
                                sound-dai = <&sai2>;
                                dai-tdm-slot-num = <8>;
                                dai-tdm-slot-width = <32>;
                        };

                        pcm3168_adc: codec {
                                sound-dai = <&pcm3168 1>;
                                clocks = <&codec_osc>;
                        };
                };
        };

There are multiple things of interest:

  • Only 4 input channels and 4 output channels are routed because the carrier board only had that wired.
  • There are two DAI links because the pcm3168 driver exposes inputs and outputs separately
  • As per the PCM3168 datasheet:
    • left justified mode is used
    • dai-tdm-slot-num is set to 8 even though only 4 are actually used
    • dai-tdm-slot-width is set to 32 because the codec takes 24-bit samples but requires 32 clocks per sample (this is solved later in userspace)
    • The codec is master which is usually best regarding clock accuracy, especially since the various SoMs on the market almost never expose the audio clock on the carrier board interface. Here, a crystal was used to clock the PCM3168.
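To make the slot settings more concrete, here is a small sketch of the bit clock they imply. It assumes the 48 kHz sample rate that appears in the ALSA configuration later in this post; the function name is only illustrative.

```python
def tdm_bit_clock_hz(slots, slot_width_bits, sample_rate_hz):
    """Bit clock needed to carry one TDM frame per sample period."""
    # One frame holds `slots` slots of `slot_width_bits` clock cycles each
    return slots * slot_width_bits * sample_rate_hz


# 8 slots of 32 clocks at 48 kHz (rate taken from the ALSA configuration)
bclk = tdm_bit_clock_hz(8, 32, 48000)
print(bclk)  # 12288000, i.e. 12.288 MHz
```

Since only 4 of the 8 slots carry audio and each sample uses 24 of the 32 clocks, a good part of this bit clock is padding, which is what the codec expects in this mode.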

The PCM3168 codec is added under the ecspi3 node as that is where it is connected:

&ecspi3 {
        pcm3168: codec@0 {
                compatible = "ti,pcm3168a";
                reg = <0>;
                spi-max-frequency = <1000000>;
                clocks = <&codec_osc>;
                clock-names = "scki";
                #sound-dai-cells = <1>;
                VDD1-supply = <&reg_module_3v3>;
                VDD2-supply = <&reg_module_3v3>;
                VCCAD1-supply = <&reg_board_5v0>;
                VCCAD2-supply = <&reg_board_5v0>;
                VCCDA1-supply = <&reg_board_5v0>;
                VCCDA2-supply = <&reg_board_5v0>;
        };
};

#sound-dai-cells is what makes it possible to select between the input and output interfaces.

On top of that, multiple issues had to be fixed:

Finally, an ALSA configuration file (/usr/share/alsa/cards/imx7-pcm3168.conf) was written to ensure samples sent to the card are in the proper format, S32_LE. 24-bit samples will simply have zeroes in the least significant byte. For 32-bit samples, the codec will properly ignore the least significant byte.
The file also declares the first subdevice as the playback (output) device and the second subdevice as the capture (input) device.

imx7-pcm3168.pcm.default {
	@args [ CARD ]
	@args.CARD {
		type string
	}
	type asym
	playback.pcm {
		type plug
		slave {
			pcm {
				type hw
				card $CARD
				device 0
			}
			format S32_LE
			rate 48000
			channels 4
		}
	}
	capture.pcm {
		type plug
		slave {
			pcm {
				type hw
				card $CARD
				device 1
			}
			format S32_LE
			rate 48000
			channels 4
		}
	}
}
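The S32_LE sample layout mentioned above can be sketched in a few lines. This only illustrates how a signed 24-bit sample ends up in a 32-bit little-endian container with a zero least significant byte; the function name is hypothetical and nothing here is part of the actual ALSA setup.

```python
import struct


def s24_to_s32_le(sample):
    """Pack a signed 24-bit sample into a 32-bit little-endian container."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample does not fit in 24 bits")
    # Shift left by 8 so the least significant byte is zero
    return struct.pack("<i", sample << 8)


print(s24_to_s32_le(0x123456).hex())  # 00563412
```

The first byte in memory is the zero padding byte, followed by the three sample bytes from least to most significant.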

On top of that, the dmix and dsnoop ALSA plugins can be used to separate channels.

To conclude, this shows that it is possible to easily leverage existing code to integrate an audio codec in a design by simply writing a device tree snippet and maybe an ALSA configuration file if necessary.

by Alexandre Belloni at May 09, 2017 08:16 AM

May 04, 2017

Free Electrons

Feedback from the Netdev 2.1 conference

At Free Electrons, we regularly work on networking topics as part of our Linux kernel contributions and thus we decided to attend our very first Netdev conference this year in Montreal. With the recent evolution of the network subsystem and its drivers’ capabilities, the conference was a very good opportunity to stay up-to-date, thanks to lots of interesting sessions.

Eric Dumazet presenting “Busypolling next generation”

The speakers and the Netdev committee did an impressive job by offering such a great schedule, and the recorded talks are already available on the Netdev YouTube channel. We particularly liked a few of those talks.

Distributed Switch Architecture – slides, video

Andrew Lunn, Vivien Didelot and Florian Fainelli presented DSA, the Distributed Switch Architecture, by giving an overview of what DSA is and by then presenting its design. They completed their talk by discussing the future of this subsystem.

DSA in one slide

The goal of the DSA subsystem is to support Ethernet switches connected to the CPU through an Ethernet controller. The distributed part comes from the possibility to have multiple switches connected together through dedicated ports. DSA was introduced nearly 10 years ago but was mostly quiet and only recently came back to life thanks to contributions made by the authors of this talk, its maintainers.

The main idea of DSA is to reuse the available internal representations and tools to describe and configure the switches. Ports are represented as Linux network interfaces to allow userspace to configure them using common tools; the Linux bridging concept is used for interface bridging and the Linux bonding concept for port trunks. A switch handled by DSA is not seen as a special device with its own control interface but rather as a hardware accelerator for specific networking capabilities.

DSA has its own data plane where the switch ports are slave interfaces and the Ethernet controller connected to the SoC a master one. Tagging protocols are used to direct the frames to a specific port when coming from the SoC, as well as when received by the switch. For example, the RX path has an extra check after netif_receive_skb() so that if DSA is used, the frame can be tagged and reinjected into the network stack RX flow.
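The role of a tagging protocol can be illustrated with a toy model. The one-byte tag prepended to the frame below is purely an assumption made for this sketch: it does not correspond to any real switch tagging format, and the function names are hypothetical.

```python
def tag_frame(port, frame):
    """Prepend a toy one-byte tag carrying the switch port number."""
    if not 0 <= port <= 255:
        raise ValueError("port must fit in one byte")
    return bytes([port]) + frame


def untag_frame(tagged):
    """Split a tagged frame into (port, original frame)."""
    if not tagged:
        raise ValueError("empty frame")
    return tagged[0], tagged[1:]


def deliver(tagged, slave_queues):
    """Strip the tag and hand the frame to the matching slave interface queue."""
    port, frame = untag_frame(tagged)
    slave_queues[port].append(frame)
```

In this model, the master interface sees every tagged frame, and deliver() plays the part of the extra RX step that hands the untagged frame to the slave interface of the right port.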

Finally, they talked about the relationship between DSA and Switchdev, and cross-chip configuration for interconnected switches. They also exposed the upcoming changes in DSA as well as long term goals.

Memory bottlenecks – slides

As part of the network performance workshop, Jesper Dangaard Brouer presented memory bottlenecks in the allocators caused by specific network workloads, and how to deal with them. The baseline SLAB/SLUB performance is found to be too slow, particularly when using XDP. One way for a driver to work around this is to implement a custom page recycling mechanism, and that’s what all high-speed drivers do. He then displayed some data to show why this mechanism is needed when targeting the 10G network budget.
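To give an idea of what such a budget means, here is a back-of-the-envelope sketch. It assumes the worst case of minimum-sized 64-byte Ethernet frames, each of which also occupies 20 extra bytes on the wire (preamble, start-of-frame delimiter and inter-frame gap); these figures are our assumption and are not taken from the talk's data.

```python
def per_packet_budget_ns(link_bps, frame_bytes=64):
    """Time available per packet at line rate, in nanoseconds."""
    # 7 bytes preamble + 1 byte start-of-frame delimiter + 12 bytes inter-frame gap
    wire_bits = (frame_bytes + 20) * 8
    packets_per_second = link_bps / wire_bits
    return 1e9 / packets_per_second


print(round(per_packet_budget_ns(10e9), 1))  # 67.2
```

With only about 67 ns per packet at 10 Gbit/s, any per-packet cost in the memory allocator quickly becomes significant.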

Jesper is working on a generic solution called page pool and sent a first RFC at the end of 2016. As mentioned in the cover letter, it’s still not ready for inclusion and was only sent for early reviews. He also gave a short overview of his implementation.

DDOS countermeasures with XDP – slides #1, slides #2 – video #1, video #2

These two talks were given by Gilberto Bertin from Cloudflare and Martin Lau from Facebook. While they were not talking about device driver implementation or improvements in the network stack directly related to what we do at Free Electrons, it was nice to see how XDP is used in production.

XDP, the eXpress Data Path, provides a programmable data path at the lowest point of the network stack by processing RX packets directly out of the drivers’ RX ring queues. It’s quite new and is an answer to lots of userspace based solutions such as DPDK. Gilberto and Martin showed excellent results, confirming the usefulness of XDP.

From a driver point of view, some changes are required to support it. RX hooks must be added as well as some API changes and the driver’s memory model often needs to be updated. So far, in v4.10, only a few drivers are supporting XDP.

XDP MythBusters – slides – video

David S. Miller, the maintainer of the Linux networking stack and drivers, gave an interesting keynote about XDP and eBPF. The eXpress Data Path clearly was the hot topic of this Netdev 2.1 conference with lots of talks related to the concept, and David gave a good overview of what XDP is, its purposes, advantages and limitations. He also quickly covered eBPF, the extended Berkeley Packet Filter, which is used in XDP to filter packets.

This presentation was a comprehensive introduction to the concepts introduced by XDP and its different use cases.

Conclusion

Netdev 2.1 was an excellent experience for us. The conference was well organized, the single track format allowed us to see every session on the schedule, and meeting with attendees and speakers was easy. The content was highly technical and an excellent opportunity to stay up-to-date with the latest changes of the networking subsystem in the kernel. The conference hosted both talks about in-kernel topics and their use in userspace, which we think is a very good approach to not focus only on the kernel side but also to be aware of the users’ needs and their use cases.

by Antoine Ténart at May 04, 2017 08:13 AM

May 02, 2017

Harald Welte

OsmoDevCon 2017 Review

After the public user-oriented OsmoCon 2017, we also recently had the 6th incarnation of our annual contributors-only Osmocom Developer Conference: The OsmoDevCon 2017.

This is a much smaller group, typically about 20 people, and is limited to actual developers who have a past record of contributing to any of the many Osmocom projects.

We had a large number of presentations and discussions. In fact, so large that the schedule of talks extended from 10am to midnight on some days. While this is great, it also means that there was definitely too little time for more informal conversations, chatting or even actual work on code.

We also have such a wide range of topics and scope inside Osmocom, that the traditional ad-hoc scheduling approach no longer seems to be working as it used to. Not everyone is interested in (or has time for) all the topics, so we should group them according to their topic/subject on a given day or half-day. This will enable people to attend only those days that are relevant to them, and spend the remaining day in an adjacent room hacking away on code.

It's sad that we only have OsmoDevCon once per year. Maybe that's actually also something to think about. Rather than having 4 days once per year, maybe have two weekends per year.

Always in motion the future is.

by Harald Welte at May 02, 2017 10:00 PM

Overhyped Docker

Overhyped Docker missing the most basic features

I've always been extremely skeptical of suddenly emerging over-hyped technologies, particularly if they advertise to solve problems by adding yet another layer to systems that are already sufficiently complex themselves.

There are of course many issues with containers, ranging from replicated system libraries to the basic underlying statement that you're giving up on the system package manager to properly deal with dependencies.

I'm also highly skeptical of FOSS projects that are primarily driven by one (VC funded?) company. Especially if their offering includes a so-called cloud service which they can stop operating at any given point in time, or (more realistically) first get everybody to use and then start charging for.

But well, despite all the bad things I read about it over the years, on one day in May 2017 I finally thought let's give it a try. My problem to solve as a test balloon is fairly simple.

My basic use case

The plan is to start OsmoSTP, the m3ua-testtool and the sua-testtool, which both connect to OsmoSTP. By running this setup inside containers and inside an internal network, we could then execute the entire testsuite e.g. during jenkins test without having IP address or port number conflicts. It could even run multiple times in parallel on one buildhost, verifying different patches as part of the continuous integration setup.

This application is not so complex. All it needs is three containers, an internal network and some connections in between. Should be a piece of cake, right?

But enter the world of buzzword-fueled web-4000.0 software-defined virtualised and orchestrated container NFV + SDN voodoo: It turns out to be impossible, at least with the preferred tools they advertise.

Dockerfiles

The part that worked relatively easily was writing a few Dockerfiles to build the actual containers. All based on debian:jessie from the library.

As m3ua-testsuite is written in guile, and needs to build some guile plugin/extension, I had to actually include guile-2.0-dev and other packages in the container, making it a bit bloated.

I couldn't immediately find a nice example Dockerfile recipe that would allow me to build stuff from source outside of the container, and then install the resulting binaries into the container. This seems to be a somewhat weak spot, where more support/infrastructure would be helpful. I guess the idea is that you simply install applications via package feeds and apt-get. But I digress.

So after some tinkering, I ended up with three docker containers:

  • one running OsmoSTP
  • one running m3ua-testtool
  • one running sua-testtool

I also managed to create an internal bridged network between the containers, so the containers could talk to one another.

However, I have to manually start each of the containers with ugly long command line arguments, such as docker run --network sigtran --ip 172.18.0.200 -it osmo-stp-master. This is of course sub-optimal, and what Docker Services + Stacks should resolve.

Services + Stacks

The idea seems good: A service defines how a given container is run, and a stack defines multiple containers and their relation to each other. So it should be simple to define a stack with three services, right?

Well, it turns out that it is not. Docker documents that you can configure a static ipv4_address [1] for each service/container, but it seems related configuration statements are simply silently ignored/discarded [2], [3], [4].

This seems to be related to the fact that, for some strange reason, stacks can (at least in later versions of docker) only use overlay type networks, rather than the much simpler bridge networks. And while bridge networks appear to support static IP address allocations, overlay apparently doesn't.

I still have a hard time grasping that something that considers itself a serious product for production use (by a company with estimated value over a billion USD, not by a few hobbyists) has no support for running containers on static IP addresses. How many applications out there have I seen that require static IP address configuration? How much simpler do setups get, if you don't have to rely on things like dynamic DNS updates (or DNS availability at all)?

So I'm stuck with having to manually configure the network between my containers, and manually starting them by clumsy shell scripts, rather than having a proper abstraction for all of that. Well done :/

Exposing Ports

Unrelated to all of the above: If you run some software inside containers, you will pretty soon want to expose some network services from containers. This should also be the most basic task on the planet.

However, it seems that the creators of docker live in the early 1980s, when only the TCP and UDP transport protocols existed. They seem to have missed that by the late 1990s to early 2000s, protocols like SCTP or DCCP were invented.

But yet, in 2017, Docker chooses to

Now some of the readers may think 'who uses SCTP anyway'. I will give you a straight answer: Everyone who has a mobile phone uses SCTP. This is due to the fact that pretty much all the connections inside cellular networks (at least for 3G/4G networks, and in reality also for many 2G networks) are using SCTP as underlying transport protocol, from the radio access network into the core network. So every time you switch your phone on, or do anything with it, you are using SCTP. Not on your phone itself, but by all the systems that form the network that you're using. And with the drive to C-RAN, NFV, SDN and all the other buzzwords also appearing in the Cellular Telecom field, people should actually worry about it, if they want to be a part of the software stack that is used in future cellular telecom systems.

Summary

After spending the better part of a day to do something that seemed like the most basic use case for running three networked containers using Docker, I'm back to step one: Most likely inventing some custom scripts based on unshare to run my three test programs in a separate network namespace for isolated test suite execution as part of a Jenkins CI setup :/

It's also clear that Docker apparently doesn't care much about playing a role in the Cellular Telecom world, which is increasingly moving away from proprietary and hardware-based systems (like STPs) to virtualised, software-based systems.

[1]https://docs.docker.com/compose/compose-file/#ipv4address-ipv6address
[2]https://forums.docker.com/t/docker-swarm-1-13-static-ips-for-containers/28060
[3]https://github.com/moby/moby/issues/31860
[4]https://github.com/moby/moby/issues/24170

by Harald Welte at May 02, 2017 10:00 PM

Free Electrons

Linux 4.11, Free Electrons contributions

Linus Torvalds released Linux 4.11 this Sunday. For an overview of the new features provided by this new release, one can read the coverage from LWN: part 1, part 2 and part 3. The KernelNewbies site also has a detailed summary of the new features.

With 137 patches contributed, Free Electrons is the 18th contributing company according to the Kernel Patch Statistics. Free Electrons engineer Maxime Ripard appears in the list of top contributors by changed lines in the LWN statistics.

Our most important contributions to this release have been:

  • Support for Atmel platforms
    • Alexandre Belloni improved suspend/resume support for the Atmel watchdog driver, I2C controller driver and UART controller driver. This is part of a larger effort to upstream support for the backup mode of the Atmel SAMA5D2 SoC.
    • Alexandre Belloni also improved the at91-poweroff driver to properly shut down LPDDR memories.
    • Boris Brezillon contributed a fix for the Atmel HLCDC display controller driver, as well as fixes for the atmel-ebi driver.
  • Support for Allwinner platforms
    • Boris Brezillon contributed a number of improvements to the sunxi-nand driver.
    • Mylène Josserand contributed a new driver for the digital audio codec on the Allwinner sun8i SoC, as well as the corresponding Device Tree changes and related fixes. Thanks to this driver, Mylène enabled audio support on the R16 Parrot and A33 Sinlinx boards.
    • Maxime Ripard contributed numerous improvements to the sunxi-mmc MMC controller driver, to support higher data rates, especially for the Allwinner A64.
    • Maxime Ripard contributed official Device Tree bindings for the ARM Mali GPU, which allows the GPU to be described in the Device Tree of the upstream kernel, even if the ARM kernel driver for the Mali will never be merged upstream.
    • Maxime Ripard contributed a number of fixes for the rtc-sun6i driver.
    • Maxime Ripard enabled display support on the A33 Sinlinx board, by contributing a panel driver and the necessary Device Tree changes.
    • Maxime Ripard continued his clean-up effort, by converting the GR8 and sun5i clock drivers to the sunxi-ng clock infrastructure, and converting the sun5i pinctrl driver to the new model.
    • Quentin Schulz added a power supply driver for the AXP20X and AXP22X PMICs used on numerous Allwinner platforms, as well as numerous Device Tree changes to enable it on the R16 Parrot and A33 Sinlinx boards.
  • Support for Marvell platforms
    • Grégory Clement added support for the RTC found in the Marvell Armada 7K and 8K SoCs.
    • Grégory Clement added support for the Marvell 88E6141 and 88E6341 Ethernet switches, which are used in the Armada 3700 based EspressoBin development board.
    • Romain Perier enabled the I2C controller, SPI controller and Ethernet switch on the EspressoBin, by contributing Device Tree changes.
    • Thomas Petazzoni contributed a number of fixes to the OMAP hwrng driver, which turns out to also be used on the Marvell 7K/8K platforms for their HW random number generator.
    • Thomas Petazzoni contributed a number of patches for the mvpp2 Ethernet controller driver, preparing the future addition of PPv2.2 support to the driver. The mvpp2 driver currently only supports PPv2.1, the Ethernet controller used on the Marvell Armada 375, and we are working on extending it to support PPv2.2, the Ethernet controller used on the Marvell Armada 7K/8K. PPv2.2 support is scheduled to be merged in 4.12.
  • Support for RaspberryPi platforms
    • Boris Brezillon contributed Device Tree changes to enable the VEC (Video Encoder) on all bcm283x platforms. Boris had previously contributed the driver for the VEC.

In addition to our direct contributions, a number of Free Electrons engineers are also maintainers of various subsystems in the Linux kernel. As part of this maintenance role:

  • Maxime Ripard, co-maintainer of the Allwinner ARM platform, reviewed and merged 85 patches from contributors
  • Alexandre Belloni, maintainer of the RTC subsystem and co-maintainer of the Atmel ARM platform, reviewed and merged 60 patches from contributors
  • Grégory Clement, co-maintainer of the Marvell ARM platform, reviewed and merged 42 patches from contributors
  • Boris Brezillon, maintainer of the MTD NAND subsystem, reviewed and merged 8 patches from contributors

Here is the detailed list of contributions, commit per commit:

by Thomas Petazzoni at May 02, 2017 12:23 PM

May 01, 2017

Harald Welte

Book on Practical GPL Compliance

My former gpl-violations.org colleague Armijn Hemel and Shane Coughlan (former coordinator of the FSFE Legal Network) have written a book on practical GPL compliance issues.

I've read through it (in the bath tub of course, what better place to read technical literature), and I can agree wholeheartedly with its contents. For those who have been involved in GPL compliance engineering there shouldn't be much new - but for the vast majority of developers out there who have had little exposure to the bread-and-butter work of providing complete and corresponding source code, it makes an excellent introductory text.

The book focuses on compliance with GPLv2, which is probably not too surprising given that it's published by the Linux Foundation, and Linux being GPLv2.

You can download an electronic copy of the book from https://www.linuxfoundation.org/news-media/research/practical-gpl-compliance

Given the subject matter is Free Software, and the book is written by long-time community members, I cannot help but be a bit surprised that the book is released under classic copyright, All rights reserved, with no freedom for the user.

Considering the sensitive legal topics touched, I can understand the possible motivation by the authors to not permit derivative works. But then, there still are licenses such as CC-BY-ND which prevent derivative works but still permit users to make and distribute copies of the work itself. I've made that recommendation / request to Shane, let's see if they can arrange for some more freedom for their readers.

by Harald Welte at May 01, 2017 10:00 PM

April 30, 2017

Harald Welte

OsmoCon 2017 Review

It's already one week past the event, so I really have to sit down and write some review on the first public Osmocom Conference ever: OsmoCon 2017.

The event was a huge success, by all accounts.

  • We've not only been sold out, but we also had to turn down some last minute registrations due to the venue being beyond capacity (60 seats). People traveled from Japan, India, the US, Mexico and many other places to attend.
  • We've had an amazing audience ranging from commercial operators to community cellular operators to professional developers doing work related to osmocom, academia, IT security crowds and last but not least enthusiasts/hobbyists, with whom the project[s] started.
  • I've received exclusively positive feedback from many attendees
  • We've had a great programme. Some part of it was of introductory nature and probably not too interesting if you've been in Osmocom for a few years. However, the work on 3G as well as the current roadmap was probably not as widely known yet. Also, I really loved to see Roch's talk about Running a commercial cellular network with Osmocom software as well as the talk on Facebook's OpenCellular BTS hardware and the Community Cellular Manager.
  • We have very professional live streaming + video recordings courtesy of the C3VOC team. Thanks a lot for your support and for having the video recordings of all talks online already on the day after the event.

We also received some requests for improvements, many of which we will hopefully consider before the next Osmocom Conference:

  • have a multiple day event. Particularly if you're traveling long-distance, it is a lot of overhead for a single-day event. We of course fully understand that. On the other hand, it was the first Osmocom Conference, and hence it was a test balloon where it was initially unclear if we'll be able to get a reasonable number of attendees interested at all, or not. And organizing an event with venue and talks for multiple days if in the end only 10 people attend would have been a lot of effort and financial risk. But now that we know there are interested folks, we can definitely think of a multiple day event next time
  • Signs indicating venue details on the last meters. I agree, this could have been better. The address of the venue was published, but we could have had some signs/posters at the door pointing you to the right meeting room inside the venue. Sorry for that.
  • Better internet connectivity. This is a double-edged sword. Of course we want our audience to be primarily focused on the talks and not distracted :P I would hope that most people are able to survive a one day event without good connectivity, but for sure we will have to improve in case of a multiple-day event in the future

In terms of my requests to the attendees, I only have two:

  • Participate in the discussions on the schedule/programme while it is still possible to influence it. When we started to put together the programme, I posted about it on the openbsc mailing list and invited feedback. Still, most people seem to have missed the time window during which talks could have been submitted and the schedule still influenced before finalizing it
  • Register in time. We have had almost no registrations until about two weeks ahead of the event (and I was considering cancelling it), and then suddenly were sold out in the week ahead of the event. We've had people who first booked their tickets, only to learn that the tickets were sold out. I guess we will introduce early bird pricing and add a very expensive last minute ticket option next year in order to increase motivation to register early and thus give us flexibility regarding venue planning.

Thanks again to everyone involved in OsmoCon 2017!

Ok, now, all of you who missed the event: Go to https://media.ccc.de/c/osmocon17 and check out the recordings. Have fun!

by Harald Welte at April 30, 2017 10:00 PM

April 28, 2017

Andrew Zonenberg, Silicon Exposed

Quest for camp stove fuel

For those of you who aren't keeping up with my occasional Twitter/Facebook posts on the subject, I volunteer with a local search and rescue unit. This means that a few times a month I have to grab my gear and run out into the woods on zero notice to find an injured hiker, locate an elderly person with Alzheimer's, or whatever the emergency du jour is.

Since I don't have time to grab fresh food on my way out the door when duty calls, I keep my pack and load-bearing vest stocked with shelf-stable foods like energy bars and surplus military rations. Many missions are short and intense, leaving me no time to eat anything but finger-food items (Clif bars and First Strike Ration sandwiches are my favorites) kept in a vest pocket.

My SAR vest. Weighs about 17 pounds / 7.7 kg once the Camelbak bladder is added.
On the other hand, during longer missions there may be opportunities to make hot food while waiting for a medevac helicopter, ground team with stretcher, etc - and of course there's plenty of time to cook a hot dinner during training weekends. Besides being a convenience, hot food and drink helps us (and the subject) avoid hypothermia so it can be a literal life-saver.

I've been using MRE chemical heaters for this, because they're small, lightweight (20 g / 0.7 oz each), and not too pricey (about $1 each from surplus dealers). Their major flaw is that they don't get all that hot, so during cold weather it's hard to get your food more than lukewarm.

I've used many kinds of camp stoves (propane and white gas primarily) over the course of my camping, but didn't own one small enough to use for SAR. My full 48-hour gear loadout (including water) weighs around 45 pounds / 20 kg, and I really didn't want to add much more to this. The MSR Whisperlite, for example, weighs in at 430 g / 15.2 oz for the stove, fuel pump, and wind shield. Add to this 150 g / 5.25 oz for the fuel bottle, a pot to cook in, and the fuel itself and you're looking at close to 1 kg / 2 pounds all told.

I have an aluminum camp frying pan that, including lid, weighs 121 g / 4.3 oz. It seemed hard to get much lighter for something large enough that you could squeeze an MRE entree into, so I kept it.

After a bit of browsing in the local Wal-Mart, I found a tiny sheet metal folding stove that weighed 112 g / 3.98 oz empty. It's designed to burn pellets of hexamine fuel.

The stove. Ignore the aluminum foil, it was there from a previous experiment.
In my testing it worked pretty well. One pellet brought 250 ml of water from 10C to boiling in six minutes, and held it at a boil for a minute before burning out. The fuel burned fairly cleanly and didn't leave that much soot on the pot either, which was nice.

What's not so nice, however, was the fuel. According to the MSDS, hexamine decomposes upon heating or contact with skin into formaldehyde, which is toxic and carcinogenic. Combustion products include such tasty substances as hydrogen cyanide and ammonia. This really didn't seem like something that I wanted to handle, or burn, in close proximity to food! Thus began my quest for a safer alternative.

My first thought was to use tea light candles, since I already had a case of a hundred for use as fire starters. In my testing, one tea light was able to heat a pot of water from 10C to 30C in a whopping 21 minutes before starting to reach an equilibrium where the pot lost heat as fast as it gained it. I continued the test out to 34 minutes, at which point it was a toasty 36C.

The stove was big enough to fit more than one tea light, so the obvious next step was to put six of them in a 3x2 grid. This heated significantly more: at the 36-minute mark my water measured a respectable 78C.

I figured I was on the right track, but needed to burn more wax per unit time. Some rough calculations suggested that a brick of paraffin wax the size of the stove and about as thick as a tea light contained 1.5 kWh of energy, and would output about 35 W of heat per wick. Assuming 25% energy transfer efficiency, which seemed reasonable based on the temperature data I had measured earlier, I needed to put out around 675 W to bring my pot to a boil in ten minutes. This came out to approximately 20 candle wicks.
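As a rough cross-check of that arithmetic, here is a minimal Python sketch. The 250 ml water volume and 10C starting temperature are assumptions borrowed from the hexamine test; the 35 W per wick, 25% efficiency, and ten-minute target come from the estimate above. With those inputs it lands at roughly 630 W and 18 wicks, in the same ballpark as the 675 W and 20 wicks quoted.

```python
import math

WATER_SPECIFIC_HEAT = 4186.0  # J/(kg*K), textbook value for liquid water


def heat_needed_j(volume_ml, t_start_c, t_end_c):
    """Energy needed to warm the water, treating 1 ml as 1 g."""
    mass_kg = volume_ml / 1000.0
    return mass_kg * WATER_SPECIFIC_HEAT * (t_end_c - t_start_c)


def required_burner_power_w(volume_ml, t_start_c, t_end_c, seconds, efficiency):
    """Burner output needed if only `efficiency` of the heat reaches the water."""
    return heat_needed_j(volume_ml, t_start_c, t_end_c) / seconds / efficiency


def wicks_needed(power_w, watts_per_wick=35.0):
    """Number of wicks, rounded up, at the estimated output per wick."""
    return math.ceil(power_w / watts_per_wick)


# Assumed inputs: 250 ml, 10C -> 100C, ten minutes, 25% transfer efficiency.
power = required_burner_power_w(250, 10, 100, 600, 0.25)
print(round(power), wicks_needed(power))
```

Feeding the post's own 675 W figure into `wicks_needed` gives the 20 wicks used for the candle.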

I started out by folding a tray out of heavy duty aluminum foil, and reinforcing it on the outside with aluminum foil duct tape. I then bought a pack of tea light wicks on Amazon and attached them to the tray with double-sided tape.
Giant 20-wicked candle before adding wax
I made a water bath on my hot plate and melted a bunch of tea lights in a beaker. I wasn't in the mood to get spattered with hot wax so I wore long-sleeved clothes and a face shield. I was pretty sure that the water bath wouldn't get anywhere near the ignition point of the wax but did the work outside on a concrete patio and had a CO2 fire extinguisher on standby just in case.

Melting wax. Safety first, everyone!
The resulting behemoth of a candle actually looked pretty nice!
20-wick, 700W thermal output candle with tea lights for scale
After I was done and the wax had solidified I put the candle in my stove and lit it off. It took a while to get started (a light breeze kept blowing out one wick or another and I used quite a few matches to get them all lit), but after a while I had a solid flame going. At the six-minute mark my water had reached 37C.

A few minutes later, disaster struck! The pool of molten wax reached the flash point and ignited across the whole surface. At this point I had a massive flame - my pot went from 48 to 82C in two minutes! This translates to 2.6 kW assuming 100% energy transfer efficiency, so actual power output was probably upwards of 5 kW.

I removed the pot (using welding gloves since the flames were licking up the handle) and grabbed a photo of the fireball before thinking about how to extinguish the fire.

Pretty sure this isn't what a stove is supposed to look like
Since I was outside on a non-flammable surface the fire wasn't an immediate safety hazard, but I wanted to put it out non-destructively to preserve evidence for failure analysis. I opted to smother it with a giant candle snuffer that I rapidly folded out of heavy-duty aluminum foil.

The carnage after the fire was extinguished. Note the discolored wax!
It took me a while to clean up the mess - the giant candle had turned tan from incomplete combustion. It had also sprung a leak at some point, spilling a bit of wax out onto my patio.

On top of that, my pot was coal-black from all of the soot the super-rich flame was putting out. My wife wouldn't let it anywhere near the sink so I scrubbed it as best I could in the bathtub, then spent probably 20 minutes scrubbing all of the gray stains off the tub itself.

In order to avoid the time-consuming casting of wax, my next test used a slug of wax from a tea light that I drilled holes in, then inserted four wicks. I covered the top of the candle with aluminum foil tape to reflect heat back up at the pot, in a bid to increase efficiency and keep the melt puddle below the flash point.

Quad-wick tea light
This performed pretty well in my test. It got my pot up to 35C at the 12-minute mark, which was right about where I expected based on the x1 and x6 candle tests, and didn't flash over.

The obvious next step was to make five of them and see if this would work any better. It ignited more easily than the "brick" candle, and reached 83C at the 6-minute mark. Before T+ 7 minutes, however, the glue on the tape had failed from the heat, and the wax flashed. By the time I got the pot out of harm's way the water was boiling and it was covered in soot (again).

This time, it was a little bit breezier and my snuffer failed to exclude enough air to extinguish the flames. I ended up having to blast it with the CO2 extinguisher I had ready for just this situation. It wasn't hard to put out and I only used about two of the ten pounds of gas. (Ironically, I had planned to take the extinguisher in to get serviced the next morning because it was almost due for annual preventive maintenance. I ended up needing a recharge too...)

After cleaning off my pot and stove, and scraping some of the spilled wax off my driveway, it was back to the drawing board. I thought about other potential fuels I had lying around, and several obvious options came to mind.

Testing booze for flammability
I'm not a big drinker but houseguests have resulted in me having a few bottles of liquor around so I tested them out. Jack didn't burn at all, and Captain Morgan white rum burned fitfully and left a sugary residue without putting out much heat. 100-proof vodka left a bit of starchy residue and was tricky to light.

A tea light cup full of 99% isopropyl alcohol brought my pot to 75C in five minutes before burning out, but was filthy and left soot everywhere. Hand sanitizer (about 60% ethanol) burned cleanly, but slower and cooler due to the water content - peak temperature of 54C and 12 minute burn time.

Ethanol seemed like a viable fuel if I could get it up to a higher concentration. I wanted to avoid liquid fuels due to difficulty of handling and the risk of spills, but a thick gel that didn't spill easily looked like a good option.

After a bit of research I discovered that calcium acetate (a salt of acetic acid) was very soluble in water, but not in alcohols. When a saturated solution of it in water is added to an alcohol it forms a stiff gel, commonly referred to as a "California snowball" because it burns and has a consistency like wet snow. I don't have any photos of my test handy, but here's a video from somebody else that shows it off nicely.



Two tea light cups full of the stuff brought my pot of water to a boil in 8 minutes, and held it there until burning out just before the 13-minute mark. I also tried boiling a FSR sandwich packet in a half-inch or so of water, and it was deliciously warm by the end. This seemed like a pretty good fuel!


Testing the calcium acetate fuel. I put a lid on the pot after taking this pic.

I filled two film-canister type containers with the calcium acetate + ethanol gel fuel and left them in my SAR pack. As luck would have it, I spent the next day looking for a missing hiker, so they spent quite a while bouncing around as I drove on dirt roads and hiked.

When I got home I was disappointed to see clear liquid inside the bag that my stove and fuel were stored in. I opened the canisters only to find a thin whitish liquid instead of a stiff gel.

It seemed that the calcium acetate gel was not very stable, and over time the calcium acetate particles would precipitate out and the solution would revert to a liquid state. This clearly would not do.

Hand sanitizer seemed like a pretty good fuel other than being underpowered and perfumed, so I went to the grocery store and started looking at ingredient lists. They all seemed pretty similar - ethanol, water, aloe and other moisturizers, perfumes, maybe colorants, and a thickener. The thickener was typically either hydroxyethyl cellulose or a carbomer.

A few minutes on Amazon turned up a bag of Carbomer 940, a polyvinyl carboxy polymer cross-linked with esters of pentaerythritol. It's supposed to produce a viscosity of 45,000 to 70,000 CPS when added to water at 0.5% by weight. I also ordered a second bottle of Reagent Alcohol (90% ethanol / 5% methanol / 5% isopropanol with no bittering agents, ketones, or non-volatile ingredients) since my other one was pretty low after the calcium acetate failure.

Carbomer 940 is fairly acidic (pH 2.7 - 3.3 at 0.5% concentration) in its pure form and only gels when neutral or alkaline, so it needs to be neutralized. The recommended base for alcohol-based gels was triethanolamine, so I picked up a bottle of that too.


Preparing to make carbomer-alcohol fuel gel

I made a 50% alcohol-water solution and added 0.5% carbomer by mass. It didn't seem to fully dissolve, leaving a bunch of goopy chunks in the beaker.


Incompletely dissolved Carbomer 940 in 50/50 water/alcohol
I left it overnight to dissolve, blended it more, and then filtered off any big clumps with a coffee filter. I then added a few drops of triethanolamine, at which point the solution immediately turned cloudy. Upon blending, a rubbery white substance precipitated out of solution and stuck to my stick blender and the sidewalls of the beaker. This was not supposed to happen!


Rubbery goop on the blender head
Precipitate at the bottom of the beaker

I tried everything I could think of - diluting the triethanolamine and adding it slowly to reduce sudden pH changes, lowering the alcohol concentration, and even letting the carbomer sit in solution for a few days before adding the triethanolamine. Nothing worked.

I went back to square one and started reading more papers and watching process demonstration videos from the manufacturer. Eventually I noticed one source that suggested increasing the pH of the water to about 8 *before* adding the carbomer. This worked and gave a beautiful clear gel!

After a bit of tinkering I found a good process: Starting with 100 ml of water, titrate to pH 8 with triethanolamine. Add 1 g of carbomer powder and blend until fully gelled. Add 300 ml of reagent alcohol a bit at a time, mixing thoroughly after each addition. About halfway through adding the alcohol the gel started to get pretty runny so I mixed in a few more drops of triethanolamine and another 500 mg of carbomer powder before mixing in the rest of the alcohol. I had only a little more alcohol left in the bottle (maybe 50 ml) so I stirred that in without bothering to measure.
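To put numbers on that recipe, here is a small Python sketch of the batch composition. It assumes volumes simply add (ignoring the slight contraction when alcohol and water mix), takes the unmeasured last addition as 50 ml, and ignores the few drops of triethanolamine. Under those assumptions the batch comes out at about 78% alcohol by volume, of which about 70% of the total is ethanol.

```python
WATER_ML = 100.0
ALCOHOL_ML = 300.0 + 50.0        # measured 300 ml plus the roughly 50 ml left in the bottle
CARBOMER_G = 1.0 + 0.5           # initial 1 g plus the extra 500 mg
REAGENT_ETHANOL_FRACTION = 0.90  # reagent alcohol is 90% ethanol


def batch_summary(water_ml, alcohol_ml, carbomer_g, ethanol_fraction):
    """Rough composition of the gel, assuming liquid volumes are additive."""
    total_ml = water_ml + alcohol_ml
    return {
        "total_ml": total_ml,
        "alcohol_vol_fraction": alcohol_ml / total_ml,
        "ethanol_vol_fraction": alcohol_ml * ethanol_fraction / total_ml,
        "carbomer_g_per_100ml": 100.0 * carbomer_g / total_ml,
    }


summary = batch_summary(WATER_ML, ALCOHOL_ML, CARBOMER_G, REAGENT_ETHANOL_FRACTION)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

The carbomer loading works out to roughly a third of a gram per 100 ml of gel, somewhat below the 0.5% by weight quoted for the viscosity spec in plain water.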

The resulting gel was quite stiff and held its shape for a little while after pouring, but could still be transferred between containers without much difficulty.


Tea light can full of my final fuel
I left the beaker of fuel in my garage for several days and shook it around a bit, but saw no evidence of degradation. Since it's basically just turbo-strength hand sanitizer (~78% instead of the usual 30-60%) without all of the perfumes and moisturizers, it should be pretty stable. I had no trouble igniting it down to 10C ambient temperatures, but may find it necessary to mix in some acetone or other low-flash-point fuel to light it reliably in the winter.

The final batch of fuel filled two polypropylene specimen jars perfectly with just a little bit left over for a cooking test.


One of my two fuel jars
One tea light canister held 10.7 g / 0.38 oz of fuel, and I typically use two at a time, so 21.4 g / 0.76 oz. One jar thus holds enough fuel for about five cook sessions, which is more than I'd ever need for a SAR mission or weekend camping trip. The final weight of my entire cooking system (stove, one fuel jar, tea light cans, and pot) comes out to 408 g / 14.41 oz, or a bit less than an empty Whisperlite stove (not counting the pot, fuel tank, or fuel)!

The only thing left was to try cooking on it. I squeezed a bacon-cheddar FSR sandwich into my pot, added a bit of water, and put it on top of the stove with two candle cups of fuel.


Nice clean blue flame, barely visible
By the six-minute mark the water was boiling away merrily and a cloud of steam was coming up around the edge of the lid. I took the pot off around 8 minutes and removed my snack.

Munching on my sandwich. You can't tell in this lighting, but the stove is still burning.
For those of you who haven't eaten First Strike Rations, the sandwiches in them are kind of like Hot Pockets or Toaster Strudels, except with a very thick and dense bread rather than a fluffy, flaky one. The fats in the bread are solid at room temperature and liquefy once it gets warm. This significantly softens the texture of the bread and makes it taste a lot better, so reaching this point is generally the primary goal when cooking one.

My sandwich was firmly over that line and tasted very good (for Army food baked two years ago). The bacon could have been a bit warmer, but the stove kept on burning until a bit after the ten-minute mark so I could easily have left it in the boiling water for another two minutes and made it even hotter.

Once I was done eating it was time to clean up. The stove had no visible dirt (beyond what was there from my previous experiments), and the tea light canisters were clean and fairly free of soot except in one or two spots around the edges. Almost no goopy residue was left behind.

Stove after the cook test
The pot was quite clean as well, with no black soot and only a very thin film of discoloration that was thin enough to leave colored interference fringes. Some of this was left over from previous testing, so if this test had been run on a virgin pot there'd be even less residue.

Bottom of the pot after the cook test


Overall, it was a long journey with many false steps, but I now have the ability to cook for myself over a weekend trip in less than a pound of weight, so I'm pretty happy. 

EDIT: A few people have asked to see the raw data from my temperature-vs-time cook tests, so here it is.

Raw data (graph 1)
 
Raw data (graph 2)

by Andrew Zonenberg (noreply@blogger.com) at April 28, 2017 08:37 PM

April 26, 2017

Bunnie Studios

Name that Ware, April 2017

The Ware for April 2017 is shown below.

This is a guest ware, but the contributor shall remain anonymous per request. Thank you for the contribution, you know who you are!

by bunnie at April 26, 2017 05:46 PM

Winner, Name that Ware March 2017

The ware for March 2017 seems to be a Schneider ATV61 industrial variable speed drive controller. As rasz_pl pointed out, I left the sticker unredacted. I had a misgiving about hiding it fearing the ware would be unguessable, but leaving it in made it perhaps a bit too easy. Prize goes to rasz_pl for being the first to guess, email me for your prize!

by bunnie at April 26, 2017 05:46 PM

April 25, 2017

Altus Metrum

TeleMini3

TeleMini V3.0 Dual-deploy altimeter with telemetry now available

TeleMini v3.0 is an update to our original TeleMini v1.0 flight computer. It is a miniature (1/2 inch by 1.7 inch) dual-deploy flight computer with data logging and radio telemetry. Small enough to fit comfortably in an 18mm tube, this powerful package does everything you need on a single board:

  • 512kB on-board data logging memory, compared with 5kB in v1.

  • 40mW, 70cm ham-band digital transceiver for in-flight telemetry and on-the-ground configuration, compared to 10mW in v1.

  • Transmitted telemetry includes altitude, speed, acceleration, flight state, igniter continuity, temperature and battery voltage. Monitor the state of the rocket before, during and after flight.

  • Radio direction finding beacon transmitted during and after flight. This beacon can be received with a regular 70cm Amateur radio receiver.

  • Barometer accurate to 100k' MSL. Reliable apogee detection, independent of flight path. Barometric data recorded on-board during flight. The v1 boards could only fly to 45k'.

  • Dual-deploy with adjustable apogee delay and main altitude. Fires standard e-matches and Q2G2 igniters.

  • 0.5” x 1.7”. Fits easily in an 18mm tube. This is slightly longer than the v1 boards to provide room for two extra mounting holes past the pyro screw terminals.

  • Uses rechargeable Lithium Polymer battery technology. All-day power in a small and light-weight package.

  • Learn more at http://www.altusmetrum.org/TeleMini/

  • Purchase these at http://shop.gag.com/home-page/telemini-v3.html

I don't have anything in these images to show just how tiny this board is—but the spacing between the screw terminals is 2.54mm (0.1in), and the whole board is only 13mm wide (1/2in).

This was a fun board to design. As you might guess from the version number, we made a couple prototypes of a version 2 using the same CC1111 SoC/radio part as version 1 but in the EasyMini form factor (0.8 by 1.5 inches). Feedback from existing users indicated that bigger wasn't better in this case, so we shelved that design.

With the availability of the STM32F042 ARM Cortex-M0 part in a 4mm square package, I was able to pack that, the higher power CC1200 radio part, a 512kB memory part and a beeper into the same space as the original TeleMini version 1 board. There is USB on the board, but it's only on some tiny holes, along with the cortex SWD debugging connection. I may make some kind of jig to gain access to that for configuration, data download and reprogramming.

For those interested in an even smaller option, you could remove the screw terminals and battery connector and directly wire to the board, and replace the beeper with a shorter version. You could even cut the rear mounting holes off to make the board shorter; there are no components in that part of the board.

by keithp's rocket blog at April 25, 2017 04:01 PM

April 18, 2017

Open Hardware Repository

White Rabbit - 18-04-17: clean that fibre and SFP!

The White Rabbit team at CERN organised a short course about fibre-optic cleaning and inspection.

A special fibre inspection microscope that automatically analyses the image to decide if a cable or SFP passes or fails the norms was demonstrated.
The images of some of the often-used cables and SFP modules that we picked from the development lab clearly showed traces of grease and dust.

The course showed undoubtedly that fibres should always be inspected and that in almost all cases they should be cleaned before plugging in.
One should not forget to inspect and clean the SFP side either!

The slides of this Fibre Cleaning and Inspection course are available via the OHWR Electronics Design project.

Thanks to Amin Shoaie from CERN's EN-EL group for making this course available.
Note that this course and the practical exercises will be repeated at CERN in the last week of April. Please contact us if you are interested.

Click on image to see the course (pdf, 711kB)

by Erik van der Bij (Erik.van.der.Bij@cern.ch) at April 18, 2017 05:49 PM

April 16, 2017

Harald Welte

Things you find when using SCTP on Linux

Observations on SCTP and Linux

When I was still doing Linux kernel work with netfilter/iptables in the early 2000's, I was somebody who actually regularly had a look at the new RFCs that came out. So I saw the SCTP RFCs, SIGTRAN RFCs, SIP and RTP, etc. all released during those years. I was quite happy to see that for new protocols like SCTP and later DCCP, Linux quickly received a mainline implementation.

Now most people won't have used SCTP so far, but it is a protocol used as transport layer in a lot of telecom protocols for more than a decade now. Virtually all protocols that have traditionally been spoken over time-division multiplex E1/T1 links have been migrated over to SCTP based protocol stackings.

Working on various Open Source telecom related projects, I of course come into contact with SCTP every so often. Particularly some years back when implementing the Erlang SIGTRAN code in erlang/osmo_ss7 and most recently now with the introduction of libosmo-sigtran with its OsmoSTP, both part of the libosmo-sccp repository.

I've also had to work with various proprietary telecom equipment over the years. Whether that's some eNodeB hardware from a large brand telecom supplier, or whether it's a MSC of some other vendor. And they all had one thing in common: Nobody seemed to use the Linux kernel SCTP code. They all used proprietary implementations in userspace, using RAW sockets on the kernel interface.

I always found this quite odd, knowing that this is the route that you have to take on proprietary OSs without native SCTP support, such as Windows. But on Linux? Why? Based on rumors, people find the Linux SCTP implementation not mature enough, but hard evidence is hard to come by.

As much as it pains me to say this, the kind of Linux SCTP bugs I have seen within the scope of our work on Osmocom seem to hint that there is at least some truth to this (see e.g. https://bugzilla.redhat.com/show_bug.cgi?id=1308360 or https://bugzilla.redhat.com/show_bug.cgi?id=1308362).

Sure, software always has bugs and will have bugs. But we at Osmocom are 10-15 years "late" with our implementations of higher-layer protocols compared to what the mainstream telecom industry does. So if we find something, and we find it even already during R&D of some userspace code, not even under load or in production, then that seems a bit unsettling.

With all their market power and plenty of Linux-based devices in the telecom sphere, one would have expected those large telecom suppliers to invest in improving the mainline Linux SCTP code. Why did none of them do so? I mean, they all use UDP and TCP of the kernel, so it works for most of the other network protocols in the kernel, but why not for SCTP? I guess it comes back to a fundamental lack of understanding of how open source development works: that it is something the given industry/user base must invest in jointly.

The latest discovered bug

During the last months, I have been implementing SCCP, SUA, M3UA and OsmoSTP (A Signal Transfer Point). They were required for an effort to add 3GPP compliant A-over-IP to OsmoBSC and OsmoMSC.

For quite some time I was seeing some erratic behavior when at some point the STP would not receive/process a given message sent by one of the clients (ASPs) connected. I tried to ignore the problem initially until the code matured more and more, but the problems remained.

It became even more obvious when using Michael Tuexen's m3ua-testtool, where sometimes even the most basic test cases consisting of sending + receiving a single pair of messages like ASPUP -> ASPUP_ACK were failing. And when the test case was re-tried, the problem often disappeared.

Also, whenever I tried to observe what was happening by means of strace, the problem would disappear completely and never re-appear until strace was detached.

Of course, given that I've written several thousands of lines of new code, it was clear to me that the bug must be in my code. Yesterday I was finally prepared to accept that it might actually be a Linux SCTP bug. Not being able to reproduce that problem on a FreeBSD VM also pointed clearly into this direction.

Now I could simply have collected some information and filed a bug report (which some kernel hackers at RedHat have thankfully invited me to do!), but I thought my use case was too complex. You would have to compile a dozen different Osmocom libraries, configure the STP, run the scheme-language m3ua-testtool in guile, etc. - I guess nobody would have bothered to go that far.

So today I tried to implement a test case that reproduced the problem in plain C, without any external dependencies. And for many hours, I couldn't make the bug show up. I tried to be as close as possible to what was happening in OsmoSTP: I used non-blocking mode on client and server, used the SCTP_NODELAY socket option, used the sctp_recvmsg() library wrapper to receive events, but the bug was not reproducible.

Some hours later, it became clear that there was one setsockopt() in OsmoSTP (actually, libosmo-netif) which enabled all existing SCTP events. I did this at the time to make sure OsmoSTP has the maximum insight possible into what's happening on the SCTP transport layer, such as address fail-overs and the like.

As it turned out, adding that setsockopt for SCTP_FLAGS to my test code made the problem reproducible. After playing around with the individual flags, it seems that enabling the SENDER_DRY_EVENT flag makes the bug appear.

You can find my detailed report about this issue in https://bugzilla.redhat.com/show_bug.cgi?id=1442784 and a program to reproduce the issue at http://people.osmocom.org/laforge/sctp-nonblock/sctp-dry-event.c

Inside the Osmocom world, luckily we can live without the SENDER_DRY_EVENT and a corresponding work-around has been submitted and merged as https://gerrit.osmocom.org/#/c/2386/
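For readers who want to see what "all events except sender-dry" might look like in practice, here is a hedged Python sketch. It is not the Osmocom patch, which is C code in libosmo-netif. The option number `SCTP_EVENTS = 11` and the field order of `struct sctp_event_subscribe` are assumptions taken from my reading of the Linux `linux/sctp.h` header; check them against your own headers before relying on this. The helper names are made up for illustration.

```python
import socket

# Assumption: value of the SCTP_EVENTS socket option in linux/sctp.h.
SCTP_EVENTS = 11

# Assumption: the leading one-byte fields of struct sctp_event_subscribe, in order.
EVENT_FIELDS = (
    "data_io",
    "association",
    "address",
    "send_failure",
    "peer_error",
    "shutdown",
    "partial_delivery",
    "adaptation_layer",
    "authentication",
    "sender_dry",
)


def build_event_subscribe(disabled=("sender_dry",)):
    """Pack the subscription struct: 1 enables an event, 0 disables it."""
    unknown = set(disabled) - set(EVENT_FIELDS)
    if unknown:
        raise ValueError(f"unknown event names: {sorted(unknown)}")
    return bytes(0 if name in disabled else 1 for name in EVENT_FIELDS)


def subscribe_events(sock, disabled=("sender_dry",)):
    """Apply the subscription to an existing SCTP socket (Linux only, untested here)."""
    proto = getattr(socket, "IPPROTO_SCTP", 132)  # 132 is the IANA number for SCTP
    sock.setsockopt(proto, SCTP_EVENTS, build_event_subscribe(disabled))


print(build_event_subscribe().hex())
```

Only `build_event_subscribe` runs in this example; `subscribe_events` needs a kernel with SCTP support and an SCTP socket to try out.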

With that work-around in place, suddenly all the m3ua-testtool and sua-testtool test cases are reliably green (PASSED) and OsmoSTP works more smoothly, too.

What do we learn from this?

Free Software in the Telecom sphere is getting too little attention. This is true even for those small portions of telecom relevant protocols that ended up in the kernel like SCTP or more recently the GTP module I co-authored. They are getting too little attention in development, even less attention in maintenance, and people seem to focus more on not using it, rather than fixing and maintaining what is there.

It makes me really sad to see this. Telecoms is such a massive industry, with billions upon billions of revenue for the classic telecom equipment vendors. Surely, they would be able to co-invest in some basic infrastructure like proper and reliable testing / continuous integration for SCTP. More recently, we see millions and more millions of VC cash burned by buzzword-flinging companies doing "NFV" and "SDN". But they would rather reimplement network stacks in userspace than fix, complete and test those little telecom infrastructure components which we have so far, like the SCTP protocol :(

Where are the contributions to open source telecom parts from Ericsson, Nokia (former NSN), Huawei and the like? I'm not even dreaming about the actual applications / network elements, but merely the maintenance of something as basic as SCTP. To be fair, Motorola was involved early on in the Linux SCTP code, and Huawei contributed a long series of fixes in 2013/2014. But that's not the kind of long-term maintenance contribution that one would normally expect from the primary interest group in SCTP.

Finally, let me thank the Linux SCTP maintainers. I'm not complaining about them! They're doing a great job, given the arcane code base and the fact that they are not working for a company that has SCTP based products as their core business. I'm sure they would love more support and contributions from the Telecom world, too.

by Harald Welte at April 16, 2017 10:00 PM

April 09, 2017

Harald Welte

SIGTRAN/SS7 stack in libosmo-sigtran merged to master

As I wrote in my blog post in February, I was working towards a more fully-featured SIGTRAN stack in the Osmocom (C-language) universe.

The trigger for this is the support of 3GPP compliant AoIP (with a BSSAP/SCCP/M3UA/SCTP protocol stacking), but it is of much more general nature.

The code has finally matured in my development branch(es) and is now ready for mainline inclusion. It's a series of about 77 (!) patches, some of which already are the squashed results of many more incremental development steps.

The result is as follows:

  • General SS7 core functions maintaining links, linksets and routes
  • xUA functionality for the various User Adaptations (currently SUA and M3UA supported)
    • MTP User SAP according to ITU-T Q.701 (using osmo_prim)
    • management of application servers (AS)
    • management of application server processes (ASP)
    • ASP-SM and ASP-TM state machine for ASP, AS-State Machine (using osmo_fsm)
    • server (SG) and client (ASP) side implementation
    • validated against ETSI TS 102 381 (by means of Michael Tuexen's m3ua-testtool)
    • support for dynamic registration via RKM (routing key management)
    • osmo-stp binary that can be used as Signal Transfer Point, with the usual "Cisco-style" command-line interface that all Osmocom telecom software has.
  • SCCP implementation, with strong focus on Connection Oriented SCCP (as that's what the A interface uses).
    • osmo_fsm based state machine for SCCP connection, both incoming and outgoing
    • SCCP User SAP according to ITU-T Q.711 (osmo_prim based)
    • Interfaces with underlying SS7 stack via MTP User SAP (osmo_prim based)
    • Support for SCCP Class 0 (unit data) and Class 2 (connection oriented)
    • All SCCP + SUA Address formats (Global Title, SSN, PC, IPv4 Address)
    • SCCP and SUA share one implementation, where SCCP messages are transcoded into SUA before processing, and re-encoded into SCCP after processing, as needed.

I have already experimentally ported OsmoMSC and OsmoHNB-GW over to libosmo-sigtran. They're now all just M3UA clients (ASPs) which connect to osmo-stp to exchange SCCP messages back and forth between them.

What's next on the agenda is to

  • finish my incomplete hacks to introduce IPA/SCCPlite as an alternative to SUA and M3UA (for backwards compatibility)
  • port over OsmoBSC to the SCCP User SAP of libosmo-sigtran
    • validate with SCCPlite lower layer against existing SCCPlite MSCs
  • implement BSSAP / A-interface procedures in OsmoMSC, on top of the SCCP-User SAP.

If those steps are complete, we will have a single OsmoMSC that can talk both IuCS to the HNB-GW (or RNCs) for 3G/3.5G as well as AoIP towards OsmoBSC. We will then have fully SIGTRAN-enabled the full Osmocom stack, and are all on track to bury the OsmoNITB that was devoid of such interfaces.

If any reader is interested in interoperability testing with other implementations, either on M3UA or on SCCP or even on A or Iu interface level, please contact me by e-mail.

by Harald Welte at April 09, 2017 10:00 PM

April 08, 2017

Andrew Zonenberg, Silicon Exposed

STARSHIPRAIDER: Preparing for high-speed I/O characterization

In my previous post, I characterized the STARSHIPRAIDER I/O circuit for high voltage fault transient performance, but was unable to adequately characterize the high speed data performance because my DSO (Rigol DS1102D) only has 100 MHz of bandwidth.

Although I did have some ideas on how to improve the performance of the current I/O circuit, it was already faster than I could measure so I had no way to know if my improvements were actually making it any better. Ideally I'd just buy an oscilloscope with several GHz of bandwidth, but I'm not made of money and those scopes tend to be in the "request a quote" price range.

The obvious solution was to build one. I already had a proven high-speed sampling architecture from my TDR project so all I had to do was repackage it as an oscilloscope and make it faster still.

The circuit was beautifully simple: an output from the FPGA drives a 50 ohm trace to a SMA connector, then a second SMA connector drives the positive input of an ADCMP572 through a 3 dB attenuator (to keep my signal within range). The negative input is driven by a cheap 12-bit I2C DAC. The comparator output is then converted from CML to LVDS and fed to the host FPGA board. Finally, a 3.3V CML output from the FPGA drives the latch enable input on the comparator.

The "ADC" algorithm is essentially the same as on my TDR. I like to think of it as an equivalent-time version of a flash ADC: rather than 256 comparators digitizing the signal once, I digitize the signal 256 times with one comparator (and of course 256 different reference voltages). The post-processing to turn the comparator outputs into 8-bit ADC codes is the same.
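The idea can be sketched in a few lines of Python. This is only a simulation under stated assumptions, not the actual firmware or post-processing: the `comparator` model, the function names, the 0 to 3.3 V range and the choice of 255 evenly spaced thresholds (so that the count fits in 8 bits) are all illustrative, and the input is assumed to repeat identically on every pass.

```python
def comparator(vin, vref):
    """Ideal comparator model (assumption): 1 if the input exceeds the reference."""
    return 1 if vin > vref else 0


def equivalent_time_flash(signal, n_levels=256, v_min=0.0, v_max=3.3):
    """Digitize a repetitive signal with one comparator and a swept reference.

    signal: input voltages, one per sample point; the same waveform is
            assumed to repeat identically for every reference voltage.
    Returns one code per sample point in the range 0..n_levels-1.
    """
    codes = [0] * len(signal)
    step = (v_max - v_min) / n_levels
    # n_levels - 1 thresholds split the range into n_levels bins.
    for level in range(1, n_levels):
        vref = v_min + level * step        # hypothetical DAC setting for this pass
        for i, vin in enumerate(signal):   # one full capture per reference voltage
            codes[i] += comparator(vin, vref)
    return codes
```

Counting how many thresholds the input exceeds is the same thermometer-to-binary conversion a flash ADC performs, just spread over many passes instead of many comparators.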

Unlike the TDR, however, I also do equivalent-time sampling in the time domain. The FPGA generates the sampling and PRBS clocks with different PLL outputs (at 250 MHz / 4 ns period), and sweeps the relative phase in 100 ps steps to produce an effective resolution of 10 Gsps / 100 ps timebase.
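A rough sketch of how per-phase captures could be merged onto a single 100 ps timebase follows. The data layout (`captures[phase][k]`) and the function name are assumptions made for illustration; only the 4 ns capture period and 100 ps phase step come from the description above.

```python
CAPTURE_PERIOD_PS = 4000   # 250 MHz capture clock
PHASE_STEP_PS = 100        # PLL phase step
N_PHASES = CAPTURE_PERIOD_PS // PHASE_STEP_PS   # 40 phases per capture clock


def interleave(captures):
    """Merge per-phase captures into one time-ordered waveform.

    captures[phase][k] is the k-th sample taken with the sampling clock
    delayed by phase * PHASE_STEP_PS (hypothetical layout).
    Returns a list of (time_ps, value) tuples sorted by time.
    """
    points = []
    for phase, samples in enumerate(captures):
        for k, value in enumerate(samples):
            t = k * CAPTURE_PERIOD_PS + phase * PHASE_STEP_PS
            points.append((t, value))
    points.sort(key=lambda p: p[0])
    return points
```

With all 40 phases captured, consecutive points end up 100 ps apart, which is where the 10 Gsps effective rate comes from.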

Without further ado here's a picture of the board. Total BOM cost including connectors and PCB was approximately $50.

Oscilloscope board (yes, it's PMOD form factor!)
After some initial firmware development I was able to get some preliminary eye renders off the board. They were, to say the least, not ideal.

250 Mbps: very bumpy rise
500 Mbps: significant eye closure even with increased drive strength

I spent quite a while tracking down other bugs before dealing with the signal integrity issues. For example, a low-frequency pulse train showed up with a very uneven duty cycle:

Duty cycle distortion
Someone suggested that I try a slow rise time pulse to show the distortion more clearly. Not having a proper arbitrary waveform generator, I made do with a squarewave and R-C lowpass filter.

Ever seen breadboarded passives interfacing to edge-launch SMA connectors before?
It appeared that I had jump discontinuities in my waveform every two blocks (color coding)
I don't have an EE degree, but I can tell this looks wrong!

Interestingly enough, two blocks (of 32 samples each) were concatenated into a single JTAG transfer. These two were read in one clock cycle and looked fine, but the junction to the next transfer seemed to be skipping samples.

As it turned out, I had forgotten to clear a flag which led to me reading the waveform data before it was done capturing. Since the circular buffer was rotating in between packets, some samples never got sent.

The next bug required zooming into the waveform a bit to see. The samples captured on the first few (the number seemed to vary across bitstream builds) of my 40 clock phases were showing up shifted by 4 ns (one capture clock).

Horizontally offset samples

I traced this issue to a synchronizer between clock domains having variable latency depending on the phase offset of the source and destination clocks. This is an inherent issue in clock domain crossing, so I think I'm just going to have to calibrate it out somehow. For the short term I'm manually measuring the number of offset phases each time I recompile the FPGA image, and then correcting the data in post-processing.
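The manual correction might look something like the sketch below. It assumes the affected phases arrive exactly one capture clock late and that the number of affected phases has already been measured by hand; the function name and data layout are hypothetical, and the real post-processing may differ.

```python
def correct_phase_offset(captures, n_offset_phases, shift=1):
    """Realign phases whose samples arrive `shift` capture clocks late.

    captures[phase] is the list of samples for that phase (hypothetical layout).
    The first n_offset_phases phases are assumed to be delayed, so their
    leading `shift` samples are dropped; the remaining phases lose their
    trailing `shift` samples so that every phase keeps the same length.
    """
    corrected = []
    for phase, samples in enumerate(captures):
        if phase < n_offset_phases:
            corrected.append(samples[shift:])
        else:
            corrected.append(samples[:len(samples) - shift])
    return corrected
```

If the offset turned out to run in the other direction, the two slices would simply swap.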

The final issue was a hardware bug. I was terminating the incoming signal with a 50Ω resistor to ground. Although this had good AC performance, at DC the current drawn from a high-level input was quite significant (66 mA at 3.3V). Since my I/O pins can't drive this much, the line was dragged down.

I decided to rework the input termination to replace the 50Ω terminator with split 100Ω resistors to 3.3V and ground. This should have about half the DC current draw, and is Thevenin equivalent to a 50Ω terminator to 1.65V. As a bonus, the mid-level termination will also allow me to AC-couple the incoming signal if that becomes necessary.
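The arithmetic behind those numbers can be checked with a short calculation. The variable and function names are illustrative, and ideal resistors and an ideal 3.3 V source are assumed.

```python
def parallel(r1, r2):
    """Resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


VCC = 3.3        # supply rail, volts
R_HIGH = 100.0   # resistor to 3.3 V, ohms
R_LOW = 100.0    # resistor to ground, ohms

# Thevenin equivalent of the split terminator
r_thevenin = parallel(R_HIGH, R_LOW)           # 50 ohms
v_thevenin = VCC * R_LOW / (R_HIGH + R_LOW)    # 1.65 V


def source_current(v_in, v_term, r_term):
    """DC current (amps) the driver supplies into a terminator at v_term."""
    return (v_in - v_term) / r_term


i_original = source_current(3.3, 0.0, 50.0)               # 66 mA into 50 ohms to ground
i_reworked = source_current(3.3, v_thevenin, r_thevenin)  # 33 mA into the split terminator
```

The same formula shows the trade-off: a low-level (0 V) input now has to sink about 33 mA, where the original ground termination drew nothing.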

Mill out trace from ground via to on-die 50Ω termination resistor

Remove soldermask from ground via and signal trace

Add 100Ω 0402 low-side terminator
Add 100Ω 0402 high-side terminator, plus jumper trace to 3.3V bulk decoupling cap

Add 10 nF high speed decoupling cap to help compensate for inductance of long feeder trace
I cleaned off all of the flux residue and ran a second set of eye loopback tests at 250 and 500 Mbps. The results were dramatically improved:

Post-rework at 250 Mbps
Post-rework at 500 Mbps
While not perfect, the new eye openings are a lot cleaner. I hope to tweak my input stage further to reduce probing artifacts, but for the time being I think I have sufficient performance to compare multiple STARSHIPRAIDER test circuits and see how they stack up at relatively high speeds.

Next step: collect some baseline data for the current STARSHIPRAIDER characterization board, then use that to inform my v0.2 I/O circuit!

by Andrew Zonenberg (noreply@blogger.com) at April 08, 2017 01:50 AM

March 31, 2017

Bunnie Studios

Name that Ware, March 2017

The Ware for March 2017 is shown below.

I honestly have no idea what this one is from or what it’s for — found it in a junk pile in China. But I was amused by the comically huge QFP, so I snapped a shot of it.

Sorry this is a little late — been ridiculously busy prepping for the launch of a line of new products for Chibitronics, going beta (hopefully) next month.

by bunnie at March 31, 2017 05:52 PM

Winner, Name that Ware February 2017

The Ware for February 2017 is a Data Harvest EcoLog.

A number of people guessed it was a datalogger of some type, but didn’t quite identify the manufacturer or model correctly. That being said, I found Josh Myer’s response an interesting read, so I’ll give the prize to him. Congrats, email me for your prize!

by bunnie at March 31, 2017 05:52 PM

March 30, 2017

Open Hardware Repository

White Rabbit - Tutorial at ICALEPCS conference in October

The ICALEPCS organizing committee will organize a pre-conference workshop on WR in Barcelona in October.

We intend to make this more of a "WR tutorial", and we think there will be something to learn and discuss for everybody: newcomers, casual users and even experts.

Online registration will open on April 17. Registration for the workshop is independent of registration to the conference. If you register, it will be a great pleasure to see you there. Also, please send me comments on the program if you have any. We still have a bit of freedom to change it if need be.

And of course, please forward this to any other people you think could be interested!

Javier

by Erik van der Bij (Erik.van.der.Bij@cern.ch) at March 30, 2017 01:59 PM

March 29, 2017

Free Electrons

Free Electrons at the Netdev 2.1 conference

Netdev 2.1 is the fourth edition of the technical conference on Linux networking. This conference is driven by the community and focuses on both the kernel networking subsystems (device drivers, net stack, protocols) and their use in user-space.

This edition will be held in Montreal, Canada, April 6 to 8, and the schedule has been posted recently, featuring amongst other things a talk giving an overview and the current status of the Distributed Switch Architecture (DSA), and a workshop about how to enable drivers to cope with heavy workloads to improve performance.

At Free Electrons, we regularly work on networking related topics, especially as part of our Linux kernel contribution for the support of Marvell or Annapurna Labs ARM SoCs. Therefore, we decided to attend our first Netdev conference to stay up-to-date with the network subsystem and network driver capabilities, and to learn about the community's latest developments.

Our engineer Antoine Ténart will be representing Free Electrons at this event. We’re looking forward to being there!

by Antoine Ténart at March 29, 2017 02:35 PM

March 26, 2017

Harald Welte

OsmoCon 2017 Updates: Travel Grants and Schedule

April 21st is approaching fast, so here are some updates. I'm particularly happy that we now have travel grants available. So if the travel expenses were preventing you from attending so far: This excuse is no longer valid!

Get your ticket now, before it is too late. There's a limited number of seats available.

OsmoCon 2017 Schedule

The list of talks for OsmoCon 2017 has been available for quite some weeks, but today we finally published the first actual schedule.

As you can see, the day is fully packed with talks about Osmocom cellular infrastructure projects. We had to cut some talk slots short (30min instead of 45min), but I'm confident that it is good to cover a wider range of topics, while at the same time avoiding fragmenting the audience with multiple tracks.

OsmoCon 2017 Travel Grants

We are happy to announce that we have received donations that allow us to provide travel grants!

This means that any attendee who is otherwise not able to cover their travel to OsmoCon 2017 (e.g. because their interest in Osmocom is not related to their work, or because their employer doesn't pay the travel expenses) can now apply for such a travel grant.

For more details see OsmoCon 2017 Travel Grants and/or contact osmocon2017@sysmocom.de.

OsmoCon 2017 Social Event

Tech Talks are nice and fine, but what many people enjoy even more at conferences is the informal networking combined with good food. For this, we have the social event at night, which is open to all attendees.

See more details about it at OsmoCon 2017 Social Event.

by Harald Welte at March 26, 2017 10:00 PM

March 23, 2017

Harald Welte

Upcoming v3 of Open Hardware miniPCIe WWAN modem USB breakout board

Back in October 2016 I designed a small open hardware breakout board for WWAN modems in mPCIe form-factor. I was thinking some other people might be interested in this, and indeed, the first manufacturing batch is already sold out by now.

Instead of ordering more of the old (v2) design, I decided to do some improvements in the next version:

  • add mounting holes so the PCB can be mounted via M3 screws
  • add U.FL and SMA sockets, so the modems are connected via a short U.FL to U.FL cable, and external antennas or other RF components can be attached via SMA. This provides strain relief for the external antenna or cabling and avoids tearing off any of the current loose U.FL to SMA pigtails
  • flip the SIM slot to the top side of the PCB, so it can be accessed even after mounting the board to some base plate or enclosure via the mounting holes
  • more meaningful labeling of the silk screen, including the purpose of the jumpers and the input voltage.

A software rendering of the resulting v3 PCB design files that I just sent for production looks like this:

/images/mpcie-breakout-v3-pcb-rendering.png

Like before, the design of the board (including schematics and PCB layout design files) is available as open hardware under CC-BY-SA license terms. For more information see http://osmocom.org/projects/mpcie-breakout/wiki

I expect it will take about three weeks until I see the first assembled boards.

I'm also planning to do an M.2 / NGFF version of it, but haven't found the time to get around to doing it so far.

by Harald Welte at March 23, 2017 11:00 PM

March 21, 2017

Harald Welte

Osmocom - personal thoughts

As I just wrote in my post about TelcoSecDay, I sometimes worry about the choices I made with Osmocom, particularly when I see all the great stuff people are doing in fields that I previously was working in, such as applied IT security as well as Linux Kernel development.

History

When people like Dieter, Holger and I started to play with what later became OpenBSC, it was just for fun. A challenge to master. A closed world to break open and to attack with the tools, the mindset and the values that we brought with us.

Later, Holger and I started to do freelance development for commercial users of Osmocom (initially basically only OpenBSC, but then OsmoSGSN, OsmoBSC, OsmoBTS, OsmoPCU and all the other bits on the infrastructure side). This led to the creation of sysmocom in 2011, and ever since we have been trying to use revenue from hardware sales as well as development contracts to subsidize and grow the Osmocom projects. We're investing most of our earnings directly into more staff that in turn works on Osmocom related projects.

NOTE

It's important to draw the distinction between the Osmocom cellular infrastructure projects which are mostly driven by commercial users and sysmocom these days, and all the many other pure just-for-fun community projects under the Osmocom umbrella, like OsmocomTETRA, OsmocomGMR, rtl-sdr, etc. I'm focussing only on the cellular infrastructure projects, as they have been at the center of my life during the past 6+ years.

In order to do this, I basically gave up my previous career[s] in IT security and Linux kernel development (as well as put things like gpl-violations.org on hold). This is a big price to pay for creating more FOSS in the mobile communications world, and sometimes I'm a bit melancholic about the "old days" before.

Financial wealth is clearly not my primary motivation, but let me be honest: I could have easily earned a shitload of money continuing to do freelance Linux kernel development, IT security or related consulting. There's a lot of demand for related skills, particularly with some experience and reputation attached. But I decided against it, and worked several years without a salary (or almost none) on Osmocom related stuff [as did Holger].

But then, even with all the sacrifices made, and the amount of revenue we can direct from sysmocom into Osmocom development: Measured against the complexity of cellular infrastructure, the amount of funding and resources is always only a fraction of what one would normally want to have to do a proper implementation. So it's constant resource shortage, combined with lots of unpaid work on those areas that are on the immediate short-term feature list of customers, and that nobody else in the community feels like he wants to work on. And that can be a bit frustrating at times.

Is it worth it?

So after 7 years of OpenBSC, OsmocomBB and all the related projects, I'm sometimes asking myself whether it has been worth the effort, and whether it was the right choice.

It was right in the sense that cellular technology is still an area that's obscure and unknown to many, and that has very little FOSS (though improving!). At the same time, cellular networks are becoming more and more essential to many users and applications. So on an abstract level, I think that every step in the direction of FOSS for cellular is as urgently needed as before, and we have had quite some success in implementing many different protocols and network elements. Unfortunately, in most cases incompletely, as the amount of funding and/or resources was always extremely limited.

Satisfaction/Happiness

On the other hand, when it comes to metrics such as personal satisfaction or professional pride, I'm not very happy or satisfied. The community remains small, the commercial interest remains limited, and as opposed to the Linux world, most players have a complete lack of understanding that FOSS is not a one-way road, but that it is important for all stakeholders to contribute to the development in terms of development resources.

Project success?

I think a collaborative development project (which to me is what FOSS is about) is only truly successful if its success does not depend on a single individual, a single small group of individuals or a single entity (company). And no matter how much I would like the above to be the case, it is not true for the Osmocom cellular infrastructure projects. Take away Holger and me, or take away sysmocom, and I think it would be pretty much dead. And I don't think I'm exaggerating here. This makes me sad, and after all these years, and after knowing quite a number of commercial players using our software, I would have hoped that the project would rest on many more shoulders by now.

This is not to belittle the efforts of all the people contributing to it, whether the team of developers at sysmocom, whether those in the community that still work on it 'just for fun', or whether those commercial users that contract sysmocom for some of the work we do. Also, there are known and unknown donors/funders, like the NLnet foundation for some parts of the work. Thanks to all of you, and clearly we wouldn't be where we are now without all of that!

But I feel it's not sufficient for the overall scope, and it's not [yet] sustainable at this point. We need more support from all sides, particularly those not currently contributing. From vendors of BTSs and related equipment that use Osmocom components. From operators that use it. From individuals. From academia.

Yes, we're making progress. I'm happy about new developments like the Iu and Iuh support, the OsmoHLR/VLR split and 2G/3G authentication that Neels just blogged about. And there's progress on the SIMtrace2 firmware with card emulation and MITM, just as well as there's progress on libosmo-sigtran (with a more complete SUA, M3UA and connection-oriented SCCP stack), etc.

But there are too few people working on this, and those people are mostly coming from one particular corner, while most of the [commercial] users do not contribute the way you would expect them to contribute in collaborative FOSS projects. You can argue that most people in the Linux world also don't contribute, but then the large commercial beneficiaries (like the chipset and hardware makers) mostly do, as do the large commercial users.

All in all, I have the feeling that Osmocom is as important as it ever was, but it's not grown up yet to really walk on its own feet. It may be able to crawl, though ;)

So for now, don't panic. I'm not suffering from burn-out or a mid-life crisis, and I don't plan on any big changes in where I put my energy: It will continue to be Osmocom. But I also think we have to have a more open discussion with everyone on how to move beyond the current situation. There's no point in staying quiet about it, or in claiming that everything is fine the way it is. We need more commitment. Not from the people already actively involved, but from those who are not [yet].

If that doesn't happen in the next let's say 1-2 years, I think it's fair that I might seriously re-consider in which field and in which way I'd like to dedicate my [I would think considerable] productive energy and focus.

by Harald Welte at March 21, 2017 06:00 PM

Returning from TelcoSecDay 2017 / General Musings

I'm just on my way back from the Telecom Security Day 2017 <https://www.troopers.de/troopers17/telco-sec-day/>, which is an invitation-only event about telecom security issues hosted by ERNW back-to-back with their Troopers 2017 <https://www.troopers.de/troopers17/> conference.

I've been presenting at TelcoSecDay in previous years and hence was again invited to join (as attendee). The event has really gained quite some traction. Where early on you could find lots of IT security / hacker crowds, the number of participants from the operator (and to a smaller extent also equipment maker) industry has been growing.

The quality of talks was great, and I enjoyed meeting various familiar faces. It's just a pity that it's only a single day - plus I had to head back to Berlin still today so I had to skip the dinner + social event.

When attending events like this, and seeing the interesting hacks that people are working on, it pains me a bit that I haven't really been doing much security work in recent years. netfilter/iptables was at least somewhat security related. My work on OpenPCD / librfid was clearly RFID security oriented, as was the work on airprobe, OsmocomTETRA, or even the EasyCard payment system hack.

I have the same feeling when attending Linux kernel development related events. I have very fond memories of working in both fields, and it was a lot of fun. Also, to be honest, I believe that the work in Linux kernel land and the general IT security research was/is appreciated much more than the endless months and years I'm now spending on improving and extending the Osmocom cellular infrastructure stack.

Beyond the appreciation, it's also the fact that both the IT security and the Linux kernel communities are much larger. There are more people to learn from and learn with, to engage in discussions and ping-pong ideas. In Osmocom, the community is too small (and I have the feeling, it's actually shrinking), and in many areas it rather seems like I am the "ultimate resource" to ask, whether about 3GPP specs or about Osmocom code structure. What I'm missing is the feeling of being part of a bigger community. So in essence, my current role in the "Open Source Cellular" corner can be a very lonely one.

But hey, I don't want to sound more depressed than I am, this was supposed to be a post about TelcoSecDay. It just happens that attending IT Security and/or Linux Kernel events makes me somewhat gloomy for the above-mentioned reasons.

Meanwhile, if you have some interesting projects/ideas at the border between cellular protocols/systems and security, I'd of course love to hear if there's some way to get my hands dirty in that area again :)

by Harald Welte at March 21, 2017 05:00 PM

March 16, 2017

Open Hardware Repository

CERN BE-CO-HT contribution to KiCad - Support of free software in public institutions:

At the Octave conference in Geneva the presentation Support of free software in public institutions: the KiCad case will be given by Javier Serrano and Tomasz Wlostowski from CERN.

KiCad is a tool to help electronics designers develop Printed Circuit Boards (PCB). CERN's BE-CO-HT section has been contributing to its development since 2011. These efforts are framed in the context of CERN's activities regarding Open Source Hardware (OSHW), and are meant to provide an environment where design files for electronics can be shared in an efficient way, without the hurdles imposed by the use of proprietary formats.

The talk will start by providing some context about OSHW and the importance of using Free Software tools for sharing design files. We will then move on to a short KiCad tutorial, and finish with some considerations about the role public institutions can play in developing and fostering the use of Free Software, and whether some of the KiCad experience can apply in other contexts.

Access to the presentation: Support of free software in public institutions: the KiCad case

by Erik van der Bij (Erik.van.der.Bij@cern.ch) at March 16, 2017 04:42 PM

March 15, 2017

Bunnie Studios

Looking for Summer Internship in Hardware Hacking?

Tim Ansell (mithro), who has been giving me invaluable advice and support on the NeTV2 project, just had his HDMI (plaintext) video capture project accepted into the Google Summer of Code. This summer, he’s looking for university students who have an interest in learning FPGAs, hacking on video, or designing circuits. To learn more you can check out his post at hdmi2usb.tv.

I’ve learned a lot working with Tim. I also respect his work ethic and he is a steadfast contributor to the open source community. This would be an excellent summer opportunity for any student interested in system-level hardware hacking!

Please note: application deadline is April 3 16:00 UTC.

by bunnie at March 15, 2017 07:47 AM

March 10, 2017

Free Electrons

Free Electrons at the Embedded Linux Conference 2017

Last month, five engineers from Free Electrons participated in the Embedded Linux Conference in Portland, Oregon. It was once again a great conference to learn new things about embedded Linux and the Linux kernel, and to meet developers from the open-source community.

Free Electrons team at work at ELC 2017, with Maxime Ripard, Antoine Ténart, Mylène Josserand and Quentin Schulz

Free Electrons talks

Free Electrons CEO Michael Opdenacker gave a talk on Embedded Linux Size Reduction techniques, for which the slides and video are available:

Free Electrons engineer Quentin Schulz gave a talk on Power Management Integrated Circuits: Keep the Power in Your Hands, the slides and video are also available:

Free Electrons selection of talks

Of course, the slides from many other talks are progressively being uploaded, and the Linux Foundation published the video recordings in record time: they are all already available on Youtube!

Below, each Free Electrons engineer who attended the conference has selected one talk he/she liked, and gives a quick summary of the talk, hopefully to encourage you to watch the corresponding video recording.

Using SWupdate to Upgrade your system, Gabriel Huau

Talk selected by Mylène Josserand.

Gabriel Huau from Witekio did a great talk at ELC about SWUpdate, a tool created by Denx to update your system. The talk gives an overview of this tool, how it works and how to use it. Updating your system is very important for embedded devices, to fix bugs and security issues or to add new features, but in an industrial context it is sometimes difficult to perform an update: devices not easily accessible, large number of devices and variants, etc. A tool that can update the system automatically or even Over The Air (OTA) can be very useful. SWUpdate is one of them.

SWUpdate can update different parts of an embedded system such as the bootloader, the kernel, the device tree, the root file system and also the application data.
It handles different image types: UBI, MTD, raw, custom Lua, U-Boot environment and even your own custom one. It includes a notifier to receive feedback about the update process, which can be useful in some cases. SWUpdate uses different local and OTA/remote interfaces such as USB, SD card, HTTP, etc. It is based on a simple update image format to indicate which images must be updated.

Many customizations can be done with this tool as it is provided with the classic menuconfig configuration tool. One great thing is that this tool is supported by Yocto Project and Buildroot so it can be easily tested.

Do not hesitate to have a look at his slides, the video of his talk or directly test SWUpdate!

GCC/Clang Optimizations for embedded Linux, Khem Raj

Talk selected by Michael Opdenacker.

Khem Raj from Comcast is a frequent speaker at the Embedded Linux Conference, and one of his strong fields of expertise is C compilers, especially LLVM/Clang and Gcc. His talk at this conference can interest anyone developing code in the C language, to know about optimizations that the compilers can use to improve the performance or size of generated binaries. See the video and slides.

Khem Raj slide about compiler optimization options

One noteworthy optimization is Clang’s -Oz (Gcc doesn’t have it), which goes even beyond -Os, by disabling loop vectorization. Note that Clang already performs better than Gcc in terms of code size (according to our own measurements). On the topic of bundle optimizations such as -O2 or -Os, Khem added that specific optimizations can be disabled in both compilers by putting the -fno- prefix in front of the name of a given optimization. The name of each optimization in a given bundle can be found through the -fverbose-asm command line option.

Another new optimization option is -Og, which is different from the traditional -g option. It still produces code that can be debugged, but in a way that provides a reasonable level of runtime performance.

On the performance side, he also recalled the Feedback-Directed Optimizations (FDO), already covered in earlier Embedded Linux Conferences, which can be used to feed the compiler with profiler statistics about code branches. The compiler can use such information to optimize the branches which are the most frequent at run-time.

Khem’s last piece of advice was not to optimize too early, and to make sure you do your debugging and profiling work first, as heavily optimized code can be very difficult to debug. Therefore, optimizations are for well-proven code only.

Note that Khem also gave a similar talk in the IoT track of the conference, which was more focused on bare-metal code optimization and portability: “Optimizing C for microcontrollers” (slides, video).

A Journey through Upstream Atomic KMS to Achieve DP Compliance, Manasi Navare

Talk selected by Quentin Schulz.

This talk was about the journey of a newcomer in the mainline kernel community to fix the DisplayPort support in the Intel i915 DRM driver. It first presented what happens from the moment we plug a cable into a monitor until we actually see an image, then where the driver is in the kernel: in the DRM subsystem, between the hardware (an Intel Integrated Graphics device) and the libdrm userspace library on which userspace applications such as the X server rely.

The bug to fix was the case where the driver would fail after updating to the requested resolution for a DP link. The other existing drivers usually fail before updating the resolution, so Manasi had to add a way to tell userspace that the DP link failed after updating the resolution. Such an addition would be useless without applications using this new information, therefore she had to work with their developers to make the applications behave correctly when reading this important information.

With a working set of patches, she thought she had done most of the work with only the upstreaming left, and didn’t know it would take her many versions to make it upstream. She wished she had sent a first version of the driver for review earlier, to save time over the whole development plus upstreaming process. She also had to make sure the changes in the userspace applications would be ready when the driver was upstreamed.

The talk was a good introduction on how DisplayPort works and an excellent example on why involving the community even in early stages of the development process may be a good idea to quicken the overall driver development process by avoiding complete rewriting of some code parts when upstreaming is under way.

See also the video and slides of the talk.

Timekeeping in the Linux Kernel, Stephen Boyd

Talk selected by Maxime Ripard.

Stephen did a great talk about one thing that is often overlooked, and really shouldn’t be: Timekeeping. He started by explaining the various timekeeping mechanisms, both in hardware and how Linux uses them. That meant covering the counters, timers, the tick, the jiffies, and the various POSIX clocks, and detailing the various frameworks using them. He also explained the various bugs that might be encountered when having an overly naive counter implementation for example, or using the wrong POSIX clock from an application.
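The summary does not say which counter bugs were covered, so the following is only one common illustration chosen here as an assumption: a free-running hardware counter of fixed width wraps around, and a naive subtraction then yields a negative interval. The 32-bit width and the function names are hypothetical.

```python
COUNTER_BITS = 32                        # assumed width of the hardware counter
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def naive_elapsed(prev, now):
    """Naive difference: goes negative once the counter has wrapped."""
    return now - prev


def wrap_safe_elapsed(prev, now):
    """Difference modulo the counter width: correct across a single wrap."""
    return (now - prev) & COUNTER_MASK
```

The masked version stays correct as long as the counter is read at least once per wrap period.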

See also the video and slides of the talk.

Android Things, Karim Yaghmour

Talk selected by Antoine Ténart.

Karim did a very good introduction to Android Things. His talk was a great overview of what this new OS from Google targeting embedded devices is, and where it comes from. He started by showing the history of Android, and he explained what this system brought to the embedded market. He then switched to the birth of Android Things; a reboot of Google’s strategy for connected devices. He finally gave an in depth explanation of the internals of this new OS, by comparing Android Things and Android, with lots of examples and demos.

Android Things replaces Brillo / Weave, and unlike its predecessor is built reusing available tools and services. It’s in fact a lightweight version of Android, with many services removed and a few additions like the PIO API to drive GPIO, I2C, PWM or UART controllers. A few services were replaced as well, most notably the launcher. The result is a not so big, but not so small, system that can run on headless devices to control various sensors; with an Android API for application developers.

See also the video and slides of the talk.

by Thomas Petazzoni at March 10, 2017 09:01 AM

March 07, 2017

Harald Welte

VMware becomes gold member of Linux Foundation: And what about the GPL?

As we can read in recent news, VMware has become a gold member of the Linux Foundation. That causes - to say the least - very mixed feelings to me.

One thing to keep in mind: The Linux Foundation is an industry association, it exists to act in the joint interest of its paying members. It is not a charity, and it does not act for the public good. I know and respect that, while some people sometimes appear to be confused about its function.

However, allowing an entity like VMware to join, despite their many years long disrespect for the most basic principles of the FOSS Community (such as: Following the GPL and its copyleft principle), really is hard to understand and accept.

I wouldn't have any issue if VMware would (prior to joining LF) have said: Ok, we had some bad policies in the past, but now we fully comply with the license of the Linux kernel, and we release all derivative/collective works in source code. This would be a positive spin: Acknowledge past issues, resolve the issues, become clean and then publicly underline your support of Linux by (among other things) joining the Linux Foundation. I'm not one to hold grudges against people who accept their past mistakes, fix the present and then move on. But no, they haven't fixed any issues.

They have had one of the worst track records in terms of intentional GPL compliance issues for many years, showing outright disrespect for Linux, the GPL and ultimately the rights of the Linux developers, not resolving those issues and at the same time joining the Linux Foundation? What kind of message does that send?

It sends the following messages:

  • you can abuse Linux, the GPL and copyleft while still being accepted amidst the Linux Foundation Members
  • it means the Linux Foundation has no ethical concerns whatsoever about accepting such entities without previously asking them to become clean
  • it also means that VMware has still not understood that Linux and FOSS is about your actions, particularly the kind of choices you make how to technically work with the community, and not against it.

So all in all, I think this move has seriously damaged the image of both entities involved. I wouldn't have expected different of VMware, but I would have hoped the Linux Foundation had some form of standards as to which entities they permit amongst their ranks. I guess I was being overly naive :(

It's a slap in the face of every developer who writes code not because he gets paid, but because it is rewarding to know that copyleft will continue to ensure the freedom of related code.

UPDATE (March 8, 2017):
 I was mistaken in my original post in that VMware didn't just join, but was a Linux Foundation member already before, it is "just" their upgrade from silver to gold that made the news recently. I stand corrected. Still doesn't make it any better that they are involved inside LF while engaging in stepping over the lines of license compliance.
UPDATE2 (March 8, 2017):
 As some people pointed out, there is no verdict against VMware. Yes, that's true. But the mere fact that they rather distribute derivative works of GPL licensed software and take this to court with an armada of lawyers (instead of simply complying with the license like everyone else) is sad enough. By the time there will be a final verdict, the product is EOL. That's probably their strategy to begin with :/

by Harald Welte at March 07, 2017 11:00 PM

Gory details of USIM authentication sequence numbers

I always thought I understood UMTS AKA (authentication and key agreement), including the re-synchronization procedure. It's been years since I wrote tools like osmo-sim-auth which you can use to perform UMTS AKA with a SIM card inserted into a PC reader, i.e. simulate what happens between the AUC (authentication center) in a network and the USIM card.

However, it is only now as the sysmocom team works on 3G support of the dedicated OsmoHLR (outside of OsmoNITB!), that I seem to understand all the nasty little details.

I always thought for re-synchronization it is sufficient to simply increment the SQN (sequence number). It turns out it isn't, as there is an MSB portion called SEQ and a lower-bit portion called IND, used for some fancy array indexing scheme: the IND selects a bucket, and each bucket holds the highest-used SEQ within that IND bucket.

If you're interested in all the dirty details and associated spec references (they always hide the important parts in some Annex) see the discussion between Neels and me in Osmocom redmine issue 1965.
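The SEQ/IND split can be sketched in a few lines. This is a simplified illustration, not the osmo-sim-auth or OsmoHLR code: the IND length of 5 bits is an assumption (it is a configurable parameter), the class and function names are invented here, and a real USIM applies further freshness checks beyond the one shown.

```python
IND_BITS = 5  # assumption: length of the IND part; configurable in practice


def split_sqn(sqn, ind_bits=IND_BITS):
    """Split an SQN into its upper SEQ part and lower IND part."""
    seq = sqn >> ind_bits
    ind = sqn & ((1 << ind_bits) - 1)
    return seq, ind


def join_sqn(seq, ind, ind_bits=IND_BITS):
    """Build an SQN from SEQ and IND."""
    return (seq << ind_bits) | ind


class UsimSqnArray:
    """Toy model of the card side: one highest-seen SEQ per IND bucket."""

    def __init__(self, ind_bits=IND_BITS):
        self.ind_bits = ind_bits
        self.seq_ms = [0] * (1 << ind_bits)

    def accept(self, sqn):
        seq, ind = split_sqn(sqn, self.ind_bits)
        if seq <= self.seq_ms[ind]:
            return False  # a real card would start re-synchronization here
        self.seq_ms[ind] = seq
        return True


card = UsimSqnArray()
first = join_sqn(seq=1, ind=0)
print(card.accept(first))     # True: SEQ 1 is fresh in bucket 0
print(card.accept(first))     # False: replay of the same SQN
print(split_sqn(first + 1))   # (1, 1): plain "+1" only moves to the next IND
```

The last line shows why simply incrementing the SQN is not enough: the increment lands in the IND bits and leaves SEQ unchanged.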

by Harald Welte at March 07, 2017 11:00 PM

March 05, 2017

Harald Welte

GTA04 project halts GTA04A5 due to OMAP3 PoP soldering issues

For those of you who don't know what the tinkerphones/OpenPhoenux GTA04 is: It is a 'professional hobbyist' hardware project (with at least public schematics, even if not open hardware in the sense that editable schematics and PCB design files are published) creating updated mainboards that can be used to upgrade Openmoko phones. They fit into the same enclosure and can use the same display/speaker/microphone.

What the GTA04 guys have been doing for many years is close to a miracle anyway: Trying to build a modern-day smartphone in low quantities, using off-the-shelf components available in those low quantities, and without a large company with its associated financial backing.

Smartphones are complex because they are highly integrated devices. A seemingly unlimited number of components is squeezed into the tiniest form-factors. This leads to complex circuit boards with many layers that take a lot of effort to design, and are expensive to build in low quantities. The fine-pitch components mandated by the integration density are another issue.

Building the original GTA01 (Neo1973) and GTA02 (FreeRunner) devices at Openmoko, Inc. must seem like a piece of cake compared to what the GTA04 guys are up to. We had a team of engineers that were familiar at least with feature phone design before, and we had the backing of a consumer electronics company with all its manufacturing resources and expertise.

Nevertheless, a small group of people around Dr. Nikolaus Schaller has been pushing the limits of what you can do in a small for-fun project, and they have my utmost respect. Well done!

Unfortunately, there is bad news. Manufacturing of their latest generation of phones (GTA04A5) has been stopped due to massive soldering problems with the TI OMAP3 package-on-package (PoP). Those PoPs are basically "RAM chip soldered onto the CPU, and the stack of both soldered to the PCB". This is used to save PCB footprint and to avoid having to route tons of extra (sensitive, matched) traces between the SDRAM and the CPU.

According to the mailing list posts, it seems to be incredibly difficult to solder the PoP stack due to the way TI has designed the packaging of the DM3730. If you want more gory details, see this post and yet another post.

It is very sad to see that what appear to be bad design choices at TI are going to bring the GTA04 project to a halt. The financial hit by having only 33% yield is already more than the small community can take, let alone unused parts that are now in stock or even thinking about further experiments related to the manufacturability of those chips.

If there's anyone with hands-on manufacturing experience on the DM3730 (or similar) TI PoP reading this: Please reach out to the GTA04 guys and see if there's anything that can be done to help them.

UPDATE (March 8, 2017):
 In an earlier version of this post I asserted that the GTA04 is open hardware (which I actually believed up to that point) until some readers pointed out to me that it isn't. It's sad it isn't, but still it has my sympathies.

by Harald Welte at March 05, 2017 11:00 PM

March 04, 2017

Free Electrons

Buildroot 2017.02 released, Free Electrons contributions

The 2017.02 version of Buildroot has been released recently, and as usual Free Electrons has been a significant contributor to this release. A total of 1369 commits have gone into this release, contributed by 110 different developers.

Before looking in more details at the contributions from Free Electrons, let’s have a look at the main improvements provided by this release:

  • The big announcement is that 2017.02 is going to be a long term support release, maintained with security and other important fixes for one year. This will allow companies, users and projects that cannot upgrade at each Buildroot release to have a stable Buildroot version to work with, coming with regular updates for security and bug fixes. A few fixes have already been collected in the 2017.02.x branch, and regular point releases will be published.
  • Several improvements have been made to support reproducible builds, i.e. the capability of having two builds of the same configuration provide the exact same bit-for-bit output. These are not enough to provide reproducible builds yet, but they are a piece of the puzzle, and more patches are pending for the next releases to move forward on this topic.
  • A package infrastructure for packages using the waf build system has been added. Seven packages in Buildroot are using this infrastructure currently.
  • Support for the OpenRISC architecture has been added, as well as improvements to the support of ARM64 (selection of ARM64 cores, possibility of building an ARM 32-bit system optimized for an ARM64 core).
  • The external toolchain infrastructure, which was all implemented in a single very complicated package, has been split into one package per supported toolchain and a common infrastructure. This makes it much easier to maintain.
  • A number of updates have been made to the toolchain components and capabilities: uClibc-ng bumped to 1.0.22 and enabled for ARM64, mips32r6 and mips64r6, gdb 7.12.1 added and switched to gdb 7.11 as the default, Linaro toolchains updated to 2016.11, ARC toolchain components updated to arc-2016.09, MIPS Codescape toolchains bumped to 2016.05-06, CodeSourcery AMD64 and NIOS2 toolchains bumped.
  • Eight new defconfigs for various hardware platforms have been added, including defconfigs for the NIOSII and OpenRISC Qemu emulation.
  • Sixty new packages have been added, and countless other packages have been updated or fixed.
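As an aside on the reproducible builds item above, the property "two builds give the exact same bit-for-bit output" can be checked by comparing cryptographic hashes of the resulting images. The sketch below is not part of Buildroot. The function names and file names are invented for illustration, and the two "images" are small stand-in files created on the fly.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(image_a, image_b):
    """True if both build outputs are bit-for-bit identical."""
    return sha256_of(image_a) == sha256_of(image_b)


# Stand-in files instead of real root filesystem images (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "build1-rootfs.tar"
    b = Path(tmp) / "build2-rootfs.tar"
    a.write_bytes(b"same bytes")
    b.write_bytes(b"same bytes")
    same = builds_match(a, b)

    # An embedded build timestamp is a typical source of differences.
    b.write_bytes(b"same bytes, built at another time")
    differs = not builds_match(a, b)

print(same, differs)
```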

Buildroot developers at work during the Buildroot Developers meeting in February 2017, after the FOSDEM conference in Brussels.

More specifically, the contributions from Free Electrons have been:

  • Thomas Petazzoni has handled the release of the first release candidate, 2017.02-rc1, and merged 742 patches out of the 1369 commits merged in this release.
  • Thomas contributed the initial work for the external toolchain infrastructure rework, which has been taken over by Romain Naour and finally merged thanks to Romain’s work.
  • Thomas contributed the rework of the ARM64 architecture description, to allow building an ARM 32-bit system optimized for a 64-bit core, and to allow selecting specific ARM64 cores.
  • Thomas contributed the raspberrypi-usbboot package, which packages a host tool that allows booting a RaspberryPi system over USB.
  • Thomas fixed a large number of build issues found by the project autobuilders, contributing 41 patches to this effect.
  • Mylène Josserand contributed a patch to the X.org server package, fixing an issue with the i.MX6 OpenGL acceleration.
  • Gustavo Zacarias contributed a few fixes on various packages.

In addition, Free Electrons sponsored the participation of Thomas in the Buildroot Developers meeting that took place after the FOSDEM conference in Brussels, early February. A report of this meeting is available on the eLinux Wiki.

The details of Free Electrons contributions:

by Thomas Petazzoni at March 04, 2017 08:53 PM

February 27, 2017

Bunnie Studios

Name that Ware, February 2017

The ware for February 2017 is shown below:

This is a ware contributed by an anonymous reader. Thanks for the contribution, you know who you are!

by bunnie at February 27, 2017 08:04 AM

Winner, Name that Ware January 2017

The Ware for January 2017 is a Philips Norelco shaver, which recently died so I thought I’d take it apart and see what’s inside. It’s pretty similar to the previous generation shaver I was using. Hard to pick a winner — Jimmyjo got the thread on the right track, Adrian got the reference to the prior blog post…from 8 years ago. I think I’ll run with Jimmyjo as the winner though, since it looks from the time stamps he was the first to push the thread into the general category of electric shaver. Congrats, email me to claim your prize (again)!

by bunnie at February 27, 2017 08:04 AM

February 24, 2017

Free Electrons

Linux 4.10, Free Electrons contributions

After 8 release candidates, Linus Torvalds released the final 4.10 Linux kernel last Sunday. A total of 13029 commits were made between 4.9 and 4.10. As usual, LWN had a very nice coverage of the major new features added during the 4.10 merge window: part 1, part 2 and part 3. The KernelNewbies Wiki has an updated page about 4.10 as well.

Out of the total of 13029 commits, 116 were made by Free Electrons engineers, which interestingly is exactly the same number of commits we made for the 4.9 kernel release!

Our main contributions for this release have been:

  • For Atmel platforms, Alexandre Belloni added support for the securam block of the SAMA5D2, which is needed to implement backup mode, a deep suspend-to-RAM state for which we will be pushing patches over the next kernel releases. Alexandre also fixed some bugs in the Atmel dmaengine and USB gadget drivers.
  • For Allwinner platforms
    • Antoine Ténart enabled the 1-wire controller on the CHIP platform
    • Boris Brezillon fixed an issue in the NAND controller driver that prevented the use of ECC chunks of 512 bytes.
    • Maxime Ripard added support for the CHIP Pro platform from NextThing, together with many feature additions for the underlying SoC, the GR8 from NextThing.
    • Maxime Ripard implemented audio capture support in the sun4i-i2s driver, bringing capture support to Allwinner A10 platforms.
    • Maxime Ripard added clock support for the Allwinner A64 to the sunxi-ng clock subsystem, and implemented numerous improvements for this subsystem.
    • Maxime Ripard reworked the pin-muxing driver on Allwinner platforms to use a new generic Device Tree binding, and deprecated the old platform-specific Device Tree binding.
    • Quentin Schulz added a MFD driver for the Allwinner A10/A13/A31 hardware block that provides ADC, touchscreen and thermal sensor functionality.
  • For the RaspberryPi platform
    • Boris Brezillon added support for the Video Encoder IP, which provides composite output. See also our recent blog post about our RaspberryPi work.
    • Boris Brezillon made a number of improvements to clock support on the RaspberryPi, which were needed for the Video Encoder IP support.
  • For the Marvell ARM platform
    • Grégory Clement enabled networking support on the Marvell Armada 3700 SoC, a Cortex-A53 based processor.
    • Grégory Clement did a large number of cleanups in the Device Tree files of Marvell platforms, fixing DTC warnings, and using node labels where possible.
    • Romain Perier contributed a brand new driver for the SPI controller of the Marvell Armada 3700, and therefore enabled SPI support on this platform.
    • Romain Perier extended the existing i2c-pxa driver to support the Marvell Armada 3700 I2C controller, and enabled I2C support on this platform.
    • Romain Perier extended the existing hardware random number generator driver for OMAP to also be usable for SafeXcel EIP76 from Inside Secure. This allows using this driver on the Marvell Armada 7K/8K SoC.
    • Romain Perier contributed support for the Globalscale EspressoBin board, a low-cost development board based on the Marvell Armada 3700.
    • Romain Perier did a number of fixes to the CESA driver, used for the cryptographic engine found on 32-bit Marvell SoCs, such as Armada 370, XP or 38x.
    • Thomas Petazzoni fixed a bug in the mvpp2 network driver, currently only used on Marvell Armada 375, but in the process of being extended to be used on Marvell Armada 7K/8K as well.
  • As the maintainer of the MTD NAND subsystem, Boris Brezillon did a few cleanups in the Tango NAND controller driver, added support for the TC58NVG2S0H NAND chip, and improved the core NAND support to accommodate controllers that have some special timing requirements.
  • As the maintainer of the RTC subsystem, Alexandre Belloni did a number of small cleanups and improvements, especially to the jz4740

Here is the detailed list of our commits to the 4.10 release:

by Thomas Petazzoni at February 24, 2017 09:12 AM

February 23, 2017

Harald Welte

Manual testing of Linux Kernel GTP module

In May 2016 we got the GTP-U tunnel encapsulation/decapsulation module developed by Pablo Neira, Andreas Schultz and myself merged into the 4.8.0 mainline kernel.

During the second half of 2016, the code basically stayed untouched. In early 2017, several patch series by (at least) three authors have been published on the netdev mailing list for review and merge.

This poses the very valid question of how we test those (sometimes quite intrusive) changes. Setting up a complete cellular network with either GPRS/EGPRS or even UMTS/HSPA is possible using OsmoSGSN and related Osmocom components. But it's of course a luxury that not many Linux kernel networking hackers have, as it involves the availability of a supported GSM BTS or UMTS hNodeB. And even if that is available, there's still the issue of having a spectrum license, or a wired setup with coaxial cable.

So as part of the recent discussions on netdev, I tested and described a minimal test setup using libgtpnl, OpenGGSN and sgsnemu.

This setup will start a mobile station + SGSN emulator inside a Linux network namespace, which talks GTP-C to OpenGGSN on the host, as well as GTP-U to the Linux kernel GTP-U implementation.

In case you're interested, feel free to check the following wiki page: https://osmocom.org/projects/linux-kernel-gtp-u/wiki/Basic_Testing

This is of course just for manual testing, and for functional (not performance) testing only. It would be great if somebody would pick up on my recent mail containing some suggestions about an automatic regression testing setup for the kernel GTP-U code. I have way too many spare-time projects in desperate need of some attention to work on this myself. And unfortunately, none of the telecom operators (who are the ones benefiting most from a Free Software accelerated GTP-U implementation) seems to be interested in at least co-funding or otherwise contributing to this effort :/

by Harald Welte at February 23, 2017 11:00 PM

February 20, 2017

Free Electrons

Free Electrons and Raspberry Pi Linux kernel upstreaming

For a few months, Free Electrons has been helping the Raspberry Pi Foundation upstream to the Linux kernel a number of display related features for the Raspberry Pi platform.

The main goal behind this upstreaming process is to get rid of the closed-source firmware that is used on non-upstream kernels every time you need to enable/access a specific hardware feature, and replace it by something that is both open-source and compliant with upstream Linux standards.

Eric Anholt has been working hard to upstream display related features. His biggest contribution has certainly been the open-source driver for the VC4 GPU, but he also worked on the display controller side, and we were contracted to help him with this task.

Our first objective was to add support for SDTV (composite) output, which appeared to be much easier than we imagined. As some of you might already know, the display controller of the Raspberry Pi already has a driver in the DRM subsystem. Our job was to add support for the SDTV encoder (also called VEC, for Video EnCoder). The driver has been submitted just before the 4.10 merge window and surprisingly made it into 4.10 (see also the patches). Eric Anholt explained on his blog:

The Raspberry Pi Foundation recently started contracting with Free Electrons to give me some support on the display side of the stack. Last week I got to review and release their first big piece of work: Boris Brezillon’s code for SDTV support. I had suggested that we use this as the first project because it should have been small and self contained. It ended up that we had some clock bugs Boris had to fix, and a bug in my core VC4 CRTC code, but he got a working patch series together shockingly quickly. He did one respin for a couple more fixes once I had tested it, and it’s now out on the list waiting for devicetree maintainer review. If nothing goes wrong, we should have composite out support in 4.11 (we’re probably a week late for 4.10).

Our second objective was to help Eric with HDMI audio support. The code has been submitted on the mailing list 2 weeks ago and will hopefully be queued for 4.12. This time, we didn’t write much code, since Eric already did the bulk of the work. What we did, though, was debug the implementation to make it work. Eric also explained on his blog:

Probably the biggest news of the last two weeks is that Boris’s native HDMI audio driver is now on the mailing list for review. I’m hoping that we can get this merged for 4.12 (4.10 is about to be released, so we’re too late for 4.11). We’ve tested stereo audio so far, no compressed audio (though I think it should Just Work), and >2 channel audio should be relatively small amounts of work from here. The next step on HDMI audio is to write the alsalib configuration snippets necessary to hide the weird details of HDMI audio (stereo IEC958 frames required) so that sound playback works normally for all existing userspace, which Boris should have a bit of time to work on still.

On our side, it has been a great experience to work on such topics with Eric, and you should expect more contributions from Free Electrons for the Raspberry Pi platform in the next months, so stay tuned!

by Boris Brezillon at February 20, 2017 04:23 PM

February 15, 2017

Harald Welte

Cellular re-broadcast over satellite

I've recently attended a seminar that (among other topics) also covered RF interference hunting. The speaker was talking about various real-world cases of RF interference and illustrating them in detail.

Of course everyone who has any interest in RF or cellular will know about fundamental issues of radio frequency interference. For the most part, you have

  • cells of the same operator interfering with each other due to too frequent frequency re-use, adjacent channel interference, etc.
  • cells of different operators interfering with each other due to intermodulation products and the like
  • cells interfering with cable TV, terrestrial TV
  • DECT interfering with cells
  • cells or microwave links interfering with SAT-TV reception
  • all types of general EMC problems

But what the speaker of this seminar covered was actually a cellular base-station being re-broadcast all over Europe via a commercial satellite (!).

It is a well-known fact that most satellites in the sky are basically just "bent pipes", i.e. they consist of an RF receiver on one frequency, a mixer to shift the frequency, and a power amplifier. So basically whatever is sent up on one frequency to the satellite gets re-transmitted back down to earth on another frequency. This is abused by "satellite hijacking" or "transponder hijacking" and has been covered for decades in various publications.

Ok, but how does cellular relate to this? Well, apparently some people are running VSAT terminals (bi-directional satellite terminals) with improperly shielded or broken cables/connectors. In that case, the RF emitted from a nearby cellular base station leaks into that cable, and will get amplified + up-converted by the block up-converter of that VSAT terminal.

The bent-pipe satellite subsequently picks this signal up and re-transmits it all over its coverage area!

I've tried to find some public documents about this, and there's surprisingly little public information about this phenomenon.

However, I could find a slide set from SES, presented at a Satellite Interference Reduction Group: Identifying Rebroadcast (GSM)

It describes a surprisingly manual and low-tech approach to hunting down the source of the interference by using an old Nokia net-monitor phone to display the MCC/MNC/LAC/CID of the cell. Even in 2011 there were already open source projects such as airprobe that could have done the job based on sampled IF data. And I'm not even starting to consider proprietary tools.

It should be relatively simple to have an SDR that you can tune to a given satellite transponder, and which then would look for any GSM/UMTS/LTE carrier within its spectrum and dump their identities in a fully automatic way.

But then, maybe it really doesn't happen often enough after all to justify such a development...

by Harald Welte at February 15, 2017 11:00 PM

February 13, 2017

Free Electrons

Power measurement with BayLibre’s ACME cape

When working on optimizing the power consumption of a board we need a way to measure its consumption. We recently bought an ACME from BayLibre to do that.

Overview of the ACME

The ACME is an extension board for the BeagleBone Black, providing multi-channel power and temperature measurement capabilities. The cape itself has eight probe connectors, allowing multi-channel measurements. Probes for USB, Jack or HE10 can be bought separately depending on the boards you want to monitor.


Last but not least, the ACME is fully open source, from the hardware to the software.

First setup

Ready to use pre-built images are available and can be flashed on an SD card. There are two different images: one acting as a standalone device and one providing an IIO capture daemon. While the latter can be used in automated farms, we chose the standalone image which provides user-space tools to control the probes and is more suited to power consumption development topics.

The standalone image userspace can also be built manually using Buildroot, a provided custom configuration and custom init scripts. The kernel should be built using a custom configuration and the device tree needs to be patched.

Using the ACME

To control the probes and get measured values the Sigrok software is used. There is currently no support to send data over the network. Because of this limitation we need to access the BeagleBone Black shell through SSH and run our commands there.

We can display information about the detected probe, by running:

# sigrok-cli --show --driver=baylibre-acme
Driver functions:
    Continuous sampling
    Sample limit
    Time limit
    Sample rate
baylibre-acme - BayLibre ACME with 3 channels: P1_ENRG_PWR P1_ENRG_CURR P1_ENRG_VOL
Channel groups:
    Probe_1: channels P1_ENRG_PWR P1_ENRG_CURR P1_ENRG_VOL
Supported configuration options across all channel groups:
    continuous: 
    limit_samples: 0 (current)
    limit_time: 0 (current)
    samplerate (1 Hz - 500 Hz in steps of 1 Hz)

The driver has four parameters (continuous sampling, sample limit, time limit and sample rate) and has one probe attached with three channels (PWR, CURR and VOL). The acquisition parameters help configure data acquisition by giving sampling limits or rates. The rates are given in Hertz, and should be within the 1 to 500 Hz range when using an ACME.

For example, to sample at 20Hz and display the power consumption measured by our probe P1:

# sigrok-cli --driver=baylibre-acme --channels=P1_ENRG_PWR \
      --continuous --config samplerate=20
FRAME-BEGIN
P1_ENRG_PWR: 1.000000 W
FRAME-END
FRAME-BEGIN
P1_ENRG_PWR: 1.210000 W
FRAME-END
FRAME-BEGIN
P1_ENRG_PWR: 1.210000 W
FRAME-END
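Output in this shape is easy to post-process. The sketch below is our own illustration, not a tool shipped with the ACME or with Sigrok. It assumes the line format shown above ("CHANNEL: value unit" between FRAME-BEGIN and FRAME-END markers), and the parse_samples helper is a made-up name. It extracts the values of one channel and averages them.

```python
import re

# Matches lines such as "P1_ENRG_PWR: 1.210000 W"
SAMPLE_LINE = re.compile(
    r"^(?P<channel>\w+):\s*(?P<value>[-+]?\d+(?:\.\d+)?)\s*(?P<unit>\S+)$"
)


def parse_samples(text, channel):
    """Return the list of values reported for one channel."""
    values = []
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if match and match.group("channel") == channel:
            values.append(float(match.group("value")))
    return values


# Captured output, copied from the example above.
SAMPLE_OUTPUT = """\
FRAME-BEGIN
P1_ENRG_PWR: 1.000000 W
FRAME-END
FRAME-BEGIN
P1_ENRG_PWR: 1.210000 W
FRAME-END
FRAME-BEGIN
P1_ENRG_PWR: 1.210000 W
FRAME-END
"""

samples = parse_samples(SAMPLE_OUTPUT, "P1_ENRG_PWR")
average = sum(samples) / len(samples)
print(f"{len(samples)} samples, average {average:.2f} W")
```

In practice the text would come from the sigrok-cli process running on the BeagleBone Black, for instance captured to a file over SSH.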

Of course there are many more options as shown in the Sigrok CLI manual.

Beta image

A new image is being developed and will change the way to use the ACME. As it’s already available in beta we tested it (and didn’t come back to the stable image). This new version aims to only use IIO to provide the probes data, instead of having a custom Sigrok driver. The main advantage is that many software tools are IIO-aware, or will be, as it’s the standard way to use this kind of sensor with the Linux kernel. Last but not least, IIO provides ways to communicate over the network.

A new webpage is available to find information on how to use the beta image, on https://baylibre-acme.github.io. This image isn’t compatible with the current stable one, which we previously described.

The first nice thing to notice when using the beta image is the Bonjour support, which helps us communicate with the board in an effortless way:

$ ping baylibre-acme.local

A new tool, acme-cli, is provided to control the probes and switch them on or off as needed. To switch the first probe on or off:

$ ./acme-cli switch_on 1
$ ./acme-cli switch_off 1

We do not need any additional custom software to use the board, as the sensors data is available using the IIO interface. This means we should be able to use any IIO aware tool to gather the power consumption values:

  • Sigrok, on the laptop/machine this time as IIO is able to communicate over the network;
  • libiio/examples, which provides the iio-monitor tool;
  • iio-capture, which is a fork of iio-readdev designed by BayLibre for an integration into LAVA (automated tests);
  • and many more.

Conclusion

We didn’t use all the possibilities offered by the ACME cape yet but so far it helped us a lot when working on power consumption related topics. The ACME cape is simple to use and comes with a working pre-built image. The beta image offers the IIO support which improved the usability of the device, and even though it’s in a beta version we would recommend using it.

by Antoine Ténart at February 13, 2017 03:38 PM

February 12, 2017

Harald Welte

Towards a real SIGTRAN/SS7 stack in libosmo-sigtran

In the good old days ever since the late 1980s - and to a surprising extent even today - telecom signaling traffic has been carried over circuit-switched SS7 with its TDM lines as physical layer, and not over an IP/Ethernet based transport.

When Holger first created OsmoBSC, the BSC-only version of OpenBSC some 7-8 years ago, he needed to implement a minimal subset of SCCP wrapped in TCP called SCCP Lite. This was due to the simple fact that the MSC with which it had to interoperate implemented this non-standard protocol stacking, which was developed + deployed before the IETF SIGTRAN WG specified M3UA or SUA. But even after those were specified in 2004, the 3GPP didn't specify how to carry A over IP in a standard way until the end of 2008, when a first A interface over IP study was released.

As time passes, more modern MSCs of course still implement classic circuit-switched SS7, but appear to have dropped SCCPlite in favor of real AoIP as specified by 3GPP meanwhile. So it's time to add this to the osmocom universe and OsmoBSC.

A couple of years ago (2010-2013) I implemented both classic SS7 (MTP2/MTP3/SCCP) as well as SIGTRAN stackings (M2PA/M2UA/M3UA/SUA) in Erlang. The result has been used in some production deployments, but only with a relatively limited feature set. Unfortunately, this code has not received any contributions in the time since, and I have to say that as an open source community project, it has failed. Also, while Erlang might be fine for core network equipment, running it on a BSC really is overkill. Keep in mind that we often run OpenBSC on really small ARM926EJS based embedded systems, much more resource constrained than any single smartphone during the last decade.

In the meantime (2015/2016) we also implemented some minimal SUA support for interfacing with UMTS femto/small cells via Iuh (see OsmoHNBGW).

So in order to proceed to implement the required SCCP-over-M3UA-over-SCTP stacking, I originally thought: well, take Holger's old SCCP code, remove it from the IPA multiplex below, stack it on top of a new M3UA codebase that is copied partially from SUA.

However, this falls short of the goals in several ways:

  • The application shouldn't care whether it runs on top of SUA or SCCP; it should use a unified interface towards the SCCP Provider. OsmoHNBGW and the SUA code already introduce such an interface based on the SCCP-User-SAP implemented using Osmocom primitives (osmo_prim). However, the old OsmoBSC/SCCPlite code doesn't have such an abstraction.
  • The code should be modular and reusable for other SIGTRAN stackings as required in the future

So I found myself sketching out what needs to be done and I ended up pretty much with a re-implementation of large parts. Not quite fun, but definitely worth it.

The strategy is:

And then finally stack all those bits on top of each other, rendering a fairly clean and modern implementation that can be used with the IuCS of the virtually unmodified OsmoHNBGW, OsmoCSCN and OsmoSGSN for testing.

Next steps in the direction of the AoIP are:

  • Implementation of the MTP-SAP based on the IPA transport
  • Binding the new SCCP code on top of that
  • Converting OsmoBSC code base to use the SCCP-User-SAP for its signaling connection

From that point onwards, OsmoBSC doesn't care anymore whether it transports the BSSAP/BSSMAP messages of the A interface over SCCP/IPA/TCP/IP (SCCPlite), SCCP/M3UA/SCTP/IP (3GPP AoIP), or even something like SUA/SCTP/IP.
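As a rough illustration of that transport independence, here is a minimal Python sketch. The Osmocom code itself is written in C, and every class and function name below is hypothetical, not the actual Osmocom API. The BSC-side function only ever talks to an abstract SCCP user interface, and the stack underneath can be swapped freely.

```python
from abc import ABC, abstractmethod


class SccpUserSap(ABC):
    """Hypothetical stand-in for the SCCP-User-SAP: the only thing the BSC sees."""

    # Protocol layers below the SAP, top to bottom (illustrative only).
    stack: tuple

    @abstractmethod
    def send_data(self, conn_id: int, payload: bytes) -> str:
        ...


class _Provider(SccpUserSap):
    def send_data(self, conn_id, payload):
        # A real provider would encode and transmit the message;
        # this sketch only reports which path the payload would take.
        return f"conn {conn_id}: {len(payload)} bytes via {'/'.join(self.stack)}"


class SccpLite(_Provider):
    stack = ("SCCP", "IPA", "TCP", "IP")


class AoIP(_Provider):
    stack = ("SCCP", "M3UA", "SCTP", "IP")


class Sua(_Provider):
    stack = ("SUA", "SCTP", "IP")


def bsc_send_bssmap(sap: SccpUserSap, conn_id: int, bssmap: bytes) -> str:
    """BSC-side code: identical regardless of which provider is plugged in."""
    return sap.send_data(conn_id, bssmap)


for provider in (SccpLite(), AoIP(), Sua()):
    print(bsc_send_bssmap(provider, 1, b"\x00\x01"))
```

The point of the design is that only the provider classes know about the layers below; the calling code never changes when a new stacking is added.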

However, the 3GPP AoIP specs (unlike SCCPlite) actually modify the BSSAP/BSSMAP payload. Rather than using Circuit Identifier Codes and then mapping the CICs to UDP ports based on some secret conventions, they actually encapsulate the IP address and UDP port information for the RTP streams. This is of course the cleaner and more flexible approach, but it means we'll have to do some further changes inside the actual BSC code to accommodate this.

by Harald Welte at February 12, 2017 11:00 PM

February 11, 2017

Harald Welte

Testing (not only) telecom protocols

When implementing any kind of communication protocol, one always dreams of some existing test suite that one can simply run against the implementation to check if it performs correctly in at least those use cases that matter to the given application.

Of course in the real world, there rarely are protocols where this is true. If test specifications exist at all, they are often just very abstract texts for human consumption that you as the reader should implement yourself.

For some (by far not all) of the protocols found in cellular networks, every so often I have seen some formal/abstract machine-parseable test specifications. Sometimes it was TTCN-2, and sometimes TTCN-3.

If you haven't heard about TTCN-3, it is basically a way to create functional tests in an abstract description (textual + graphical), and then compile that into an actual executable test suite that you can run against the implementation under test.

However, when I last did some research into this several years ago, I couldn't find any Free / Open Source tools to actually use those formally specified test suites. This is not a big surprise, as even much more fundamental tools for many telecom protocols are missing, such as good/complete ASN.1 compilers, or even CSN.1 compilers.

To my big surprise I now discovered that Ericsson had released their (formerly internal) TITAN TTCN3 Toolset as Free / Open Source Software under EPL 1.0. The project is even part of the Eclipse Foundation. Now I'm certainly not a friend of Java or Eclipse by any means, but well, for running tests I'd certainly not complain.

The project also doesn't seem to have been a one-time code drop, but seems very active, with many repositories on github. For example, the core module, titan.core, shows plenty of activity on an almost daily basis. Also, binary releases for a variety of distributions are made available. They even have a video showing the installation ;)

If you're curious about TTCN-3 and TITAN, Ericsson also have made available a great 200+ pages slide set about TTCN-3 and TITAN.

I haven't yet had time to play with it, but it definitely is rather high on my TODO list to try.

ETSI provides a couple of test suites in TTCN-3 for protocols like DIAMETER, GTP2-C, DMR, IPv6, S1AP, LTE-NAS, 6LoWPAN, SIP, and others at http://forge.etsi.org/websvn/ (It's also the first time I've seen that ETSI has an SVN server. Everyone else is using git these days, but yes, a revision control system rather than periodic ZIP files is definitely big progress. They should do that for their reference codecs and ASN.1 files, too.)

I'm not sure when I'll get around to it. Sadly, there is no TTCN-3 for SCCP, SUA, M3UA or any SIGTRAN related stuff, otherwise I would want to try it right away. But it definitely seems like a very interesting technology (and tool).

by Harald Welte at February 11, 2017 11:00 PM

February 10, 2017

Harald Welte

FOSDEM 2017

Last weekend I had the pleasure of attending FOSDEM 2017. For many years now, it has probably been the most exciting annual event that is exclusively about Free Software.

My personal highlights (next to meeting plenty of old and new friends) in terms of the talks were:

I was attending but not so excited by Georg Greve's OpenPOWER talk. It was a great talk, and it is an important topic, but the engineer in me would have hoped for some actual beefy technical stuff. But well, I was just not the right audience. I had heard about OpenPOWER quite some time ago and have been following it from a distance.

The LoRaWAN talk couldn't have been any less technical, despite stating technical, political and cultural in the topic. But then, well, just recently 33C3 had the most exciting LoRa PHY Reverse Engineering Talk by Matt Knight.

Other talks whose recordings I still want to watch one of these days:

by Harald Welte at February 10, 2017 11:00 PM

February 05, 2017

Andrew Zonenberg, Silicon Exposed

STARSHIPRAIDER: Input buffer rev 0.1 design and characterization

Working as an embedded systems pentester is a lot of fun, but it comes with some annoying problems. There's so many tools that I can never seem to find the right one. Need to talk to a 3.3V UART? I almost invariably have an FTDI cable configured for 5 or 1.8V on my desk instead. Need to dump a 1.8V flash chip? Most of our flash dumpers won't run below 3.3. Need to sniff a high-speed bus? Most of the Saleae Logic analyzers floating around the lab are too slow to keep up with fast signals, and the nice oscilloscopes don't have a lot of channels. And everyone's favorite jack-of-all-trades tool, the Bus Pirate, is infamous for being slow.

As someone with no shortage of virtual razors, I decided that this yak needed to be shaved! The result was an ongoing project I call STARSHIPRAIDER. There will be more posts on the project in the coming months so stay tuned!

The first step was to decide on a series of requirements for the project:
  • 32 bidirectional I/O ports split into four 8-pin banks.
    This is enough to sniff any commonly encountered embedded bus other than DRAM. Multiple banks are needed to support multiple voltage levels in the same target.
  • Full support for 1.2 to 5V logic levels.
    This is supposed to be a "Swiss Army knife" embedded systems debug/testing tool. This voltage range encompasses pretty much any signalling voltage commonly encountered in embedded devices.
  • Tolerance to +/- 12V DC levels.
    Test equipment needs to handle some level of abuse. When you're reverse engineering a board it's easy to hook up ground to the wrong signal, probe a power rail, or even do both at once. The device doesn't have to function in this state (shutting down for protection is OK) but needs to not suffer permanent damage. It's also OK if the protection doesn't handle AC sources - the odds of accidentally connecting a piece of digital test equipment to a big RF power amplifier are low enough that I'm not worried.
  • 500 Mbps input/output rate for each pin.
    This was a somewhat arbitrary choice, but preliminary math indicated it was feasible. I wanted something significantly faster than existing tools in the class.
  • Ethernet-based interface to host PC.
    I've become a huge fan of Ethernet and IPv6 as communications interface for my projects. It doesn't require any royalties or license fees, scales from 10 Mbps to >10 Gbps and supports bridging between different link speeds, supports multi-master topologies, and can be bridged over a WAN or VPN. USB and PCIe, the two main alternatives, can do few if any of these.
  • Large data buffer.
    Most USB logic analyzers have very high peak capture rates, but the back-haul interface to the host PC can't keep up with extended captures at high speed. Commodity DRAM is so cheap that there's no reason to not stick a whole SODIMM of DDR3 in the instrument to provide an extremely deep capture buffer.
  • Multiple virtual instruments connected to a crossbar.
    Any nontrivial embedded device contains multiple buses of interest to a reverse engineer. STARSHIPRAIDER needs to be able to connect to several at once (on arbitrary pins), bridge them out to separate TCP ports, and allow multiple testers to send test vectors to them independently.
The brain of the system will be fairly straightforward high-speed digital. It will be a 6-8 layer PCB with an Artix-7 FPGA in FGG484 package, a SODIMM socket for 4GB of DDR3-800, a KSZ9031 Gigabit Ethernet PHY, a TLK10232 10 Gbit Ethernet PHY, and an SFP+ cage, plus some sort of connector (most likely a Samtec Q-strip) for talking to the I/O subsystem on a separate board.

The challenging part of the design, from an architectural perspective, seemed to be the I/O buffer and input protection circuit, so I decided to prototype it first.

STARSHIPRAIDER v0.1 I/O buffer design

A block diagram of the initial buffer design is shown above. The output buffer will be discussed in a separate post once I've had a chance to test it; today we'll be focusing on the input stage (the top half of the diagram).

During normal operation, the protection relay is closed. The series resistor has insignificant resistance compared to the input impedance of the comparator (an ADCMP607), so it can be largely ignored. The comparator checks the input signal against a threshold (chosen appropriately for the I/O standard in use) and sends a differential signal to the host board for processing. But what if something goes wrong?

If the user accidentally connects the probe to a signal outside the acceptable voltage range, a Schottky diode connected to the +5V or ground rail will conduct and shunt the excess voltage safely into the power rails. The series resistor limits fault current to a safe level (below the diode's peak power rating). After a short time (about 150 µs with my current relay driver), the protection relay opens and breaks the circuit.

The relay is controlled by a Silego GreenPAK4 mixed-signal FPGA, running a small design written in Verilog and compiled with my open-source toolchain. The code for the GreenPAK design is on Github.

All well and good in theory... but does it work? I built a characterization board containing a single I/O buffer and loaded with test points and probe connectors. You can grab the KiCAD files for this on Github as well. Here's a quick pic after assembly:

STARSHIPRAIDER I/O characterization board
Initial test results were not encouraging. Positive overvoltage spikes were clamped to +8V and negative spikes were clamped to -1V - well outside the -0.5 to +6V absolute max range of my comparator.
Positive transient response

Negative transient response


After a bit of review of the schematics, I found two errors. The "5V" ESD diode I was using to protect the high side had a poorly controlled Zener voltage and could clamp as high as 8V or 9V. The Schottky on the low side was able to survive my fault current but the forward voltage increased massively beyond the nominal value.

I reworked the board to replace the series resistor with a larger one (39 ohms) to reduce the maximum fault current, replaced the low-side Schottky with one that could handle more current, and replaced the Zener with an identical Schottky clamping to the +5V rail.
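To get a feel for the numbers after the rework, here is a back-of-the-envelope Python sketch of the fault current through the 39 ohm series resistor at the +/- 12V limits from the requirements. The 0.4 V Schottky forward drop is an assumed round figure, not a measured value, and the function name is made up for this illustration.

```python
def fault_current_ma(v_fault, v_rail, r_series=39.0, v_forward=0.4):
    """Current (mA) through the series resistor while a clamp diode conducts.

    v_forward is an assumed Schottky forward drop, not a datasheet value.
    """
    # The conducting diode holds the protected node one forward drop beyond the rail.
    if v_fault >= v_rail:
        v_clamp = v_rail + v_forward
    else:
        v_clamp = v_rail - v_forward
    return abs(v_fault - v_clamp) / r_series * 1000.0


high = fault_current_ma(+12.0, 5.0)  # probe touches +12 V, clamp to the +5V rail
low = fault_current_ma(-12.0, 0.0)   # probe touches -12 V, clamp to ground
print(f"high side: {high:.0f} mA, low side: {low:.0f} mA")
```

Under these assumptions the low-side diode sees noticeably more current than the high-side one, simply because ground is further away from -12 V than the +5V rail is from +12 V.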

Testing this version gave much better results. There was still a small amount of ringing (less than five nanoseconds) a few hundred mV past the limit, but the comparator's ESD diodes should be able to safely dissipate this brief pulse.

Positive transient response, after rework
Negative transient response, after rework
Now it was time to test the actual signal path. My first iteration of the test involved cobbling together a signal path from an FPGA board through the test platform and to the oscilloscope without any termination. The source of the signal was a BNC-to-minigrabber flying lead test clip! Needless to say, results were less than stellar.

PRBS31 eye at 80 Mbps through protection circuit with flying leads and no terminator
After ordering some proper RF test supplies (like an inline 50 ohm BNC terminator), I got much better signal quality. The eye was very sharp and clear at 100 Mbps. It was visibly rounded at 200 Mbps, but rendering a squarewave at that rate requires bandwidth much higher than the 100 MHz of my oscilloscope, so results were inconclusive.
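The bandwidth argument can be made concrete with a small Python sketch. It lists the odd harmonics of the fastest NRZ pattern (alternating 1010...) and how strongly a scope would attenuate each one. The scope is modelled as a single-pole low-pass with a 100 MHz -3 dB point, which is an assumption for illustration; a real front end may roll off differently.

```python
import math


def pattern_harmonics_mhz(bit_rate_mbps, count=3):
    """Odd harmonics (MHz) of a 1010... NRZ pattern at the given bit rate."""
    fundamental = bit_rate_mbps / 2.0  # one full cycle spans two bits
    return [fundamental * (2 * i + 1) for i in range(count)]


def single_pole_gain(freq_mhz, bandwidth_mhz=100.0):
    """Amplitude response of an assumed single-pole low-pass front end."""
    return 1.0 / math.sqrt(1.0 + (freq_mhz / bandwidth_mhz) ** 2)


for rate in (100, 200):
    gains = [(f, round(single_pole_gain(f), 2)) for f in pattern_harmonics_mhz(rate)]
    print(rate, "Mbps:", gains)
```

At 200 Mbps the fundamental alone already sits at the scope's nominal bandwidth, so the harmonics that would square up the edges are mostly lost before they reach the screen.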

PRBS31 eye at 100 Mbps through protection circuit with proper cabling
PRBS31 eye at 200 Mbps, limited by oscilloscope bandwidth
I then hooked the protection circuit up to the comparator to test the entire inbound signal chain. While the eye looked pretty good at 100 Mbps (plotting one leg of the differential since my scope was out of channels), at 200 Mbps horrible jitter appeared.

PRBS31 eye at 100 Mbps through full input buffer
PRBS31 eye at 200 Mbps through full input buffer
After quite a bit of scratching my head and fumbling with datasheets, I realized my oscilloscope was the problem by plotting the clock reference I was triggering on. The jitter was visible in this clock as well, suggesting that it was inherent in the oscilloscope's trigger circuit. This isn't too surprising considering I'm really pushing the limits of this scope - I need a better one to do this kind of testing properly.

PRBS31 eye at 200 Mbps plus 200 MHz sync clock
At this point I've done about all of the input stage testing I can do with this oscilloscope. I'm going to try and rig up a BER tester on the FPGA so I can do PRBS loopback through the protection stage and comparator at higher speeds, then repeat for the output buffer and the protection run in the opposite direction.
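For readers unfamiliar with PRBS loopback testing, the idea can be sketched in a few lines of Python (the real tester would of course live in FPGA logic). PRBS31 is generated by a 31-bit LFSR with the polynomial x^31 + x^28 + 1; the receiver runs the same generator and counts mismatches. The function names and the injected error positions are made up for this illustration.

```python
from itertools import islice


def prbs31(seed=0x7FFFFFFF):
    """Yield PRBS31 bits (polynomial x^31 + x^28 + 1) from a 31-bit LFSR."""
    state = seed & 0x7FFFFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        # Taps at stages 31 and 28: XOR of the bits emitted 31 and 28 steps ago.
        new_bit = ((state >> 30) ^ (state >> 27)) & 1
        state = ((state << 1) | new_bit) & 0x7FFFFFFF
        yield new_bit


def bit_errors(received, seed=0x7FFFFFFF):
    """Count mismatches between received bits and a local PRBS31 reference."""
    return sum(r != e for r, e in zip(received, prbs31(seed)))


tx = list(islice(prbs31(), 10_000))
rx = tx.copy()
for position in (100, 5000):  # simulate two bit flips in the channel
    rx[position] ^= 1

errors = bit_errors(rx)
print(f"{errors} errors in {len(rx)} bits, BER = {errors / len(rx):.1e}")
```

This sketch assumes the receiver knows the transmitter's seed. A hardware checker would more likely self-synchronize by predicting each bit from the previously received ones, using the same recurrence.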

I still have more work to do on the protection circuit as well... while it's fine at 100 Mbps, the 2x 10pF Schottky diode parasitic capacitance is seriously degrading my rise times (I calculated an RC filter -3dB point of around 200 MHz, so higher harmonics are being chopped off). I have some ideas on how I can cut this down considerably, but that will require a board respin and another blog post!
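The -3dB figure can be reproduced with the usual first-order RC formula, f = 1 / (2πRC). This Python sketch assumes that R is just the 39 ohm series resistor from the rework and that C is the two 10 pF diode capacitances in parallel; source impedance and other parasitics are ignored.

```python
import math


def rc_cutoff_hz(r_ohms, c_farads):
    """-3 dB corner frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


# Assumed values: 39 ohm series resistor, two 10 pF diodes in parallel.
f_c = rc_cutoff_hz(39.0, 2 * 10e-12)
print(f"{f_c / 1e6:.0f} MHz")
```

With these values the corner lands at roughly 204 MHz, in line with the "around 200 MHz" quoted above.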

by Andrew Zonenberg (noreply@blogger.com) at February 05, 2017 10:20 AM

February 03, 2017

Free Electrons

Video and slides from Linux Conf Australia

Linux Conf Australia took place two weeks ago in Hobart, Tasmania. For the second time, a Free Electrons engineer gave a talk at this conference: for this edition, Free Electrons CTO Thomas Petazzoni gave a talk titled A tour of the ARM architecture and its Linux support. This talk was intended as an introduction-level talk to explain what ARM is, the concept behind the ARM architecture and ARM System-on-chip, the bootloaders typically used on ARM, and the Linux support for ARM with the concept of Device Tree.

The slides of the talk are available in PDF format, and the video is available on Youtube. We got some nice feedback afterwards, which is a good indication a number of attendees found it informative.

All the videos from the different talks are also available on Youtube.

We once again found LCA to be a really great event, and want to thank the LCA organization for accepting our talk proposal and funding the travel expenses. Next year's LCA, in 2018, will take place in Sydney, in mainland Australia.

by Thomas Petazzoni at February 03, 2017 01:03 PM

February 02, 2017

Free Electrons

Free Electrons at FOSDEM and the Buildroot Developers Meeting

Like every year, a number of Free Electrons engineers will be attending the FOSDEM conference next weekend, on February 4 and 5, in Brussels. This year, Mylène Josserand and Thomas Petazzoni are going to FOSDEM. Being the biggest European open-source conference, FOSDEM is a great opportunity to meet a large number of open-source developers and learn about new projects.

In addition, Free Electrons is sponsoring the participation of Thomas Petazzoni in the Buildroot Developers meeting, which takes place over two days right after the FOSDEM conference. During this event, the Buildroot developer community gathers to make progress on the project by discussing current topics and working on the patches that have been submitted and need to be reviewed and merged.

by Thomas Petazzoni at February 02, 2017 01:28 PM

Free Electrons at the Embedded Linux Conference 2017

The next Embedded Linux Conference will take place later this month in Portland (US), from February 21 to 23, with a great schedule of talks. As usual, a number of Free Electrons engineers will attend this event, and we will also be giving a few talks.

Embedded Linux Conference 2017

Free Electrons CEO Michael Opdenacker will deliver a talk on Embedded Linux size reduction techniques, while Free Electrons engineer Quentin Schulz will give a talk on Power Management Integrated Circuits: Keep the Power in Your Hands. In addition, Free Electrons engineers Maxime Ripard, Antoine Ténart and Mylène Josserand will be attending the conference.

We once again look forward to meeting our fellow members of the embedded Linux and Linux kernel communities!

by Thomas Petazzoni at February 02, 2017 08:46 AM

January 31, 2017

Harald Welte

Osmocom Conference 2017 on April 21st

I'm very happy that in 2017, we will have the first ever technical conference on the Osmocom cellular infrastructure projects.

For many years, we have had a small, invitation only event by Osmocom developers for Osmocom developers called OsmoDevCon. This was fine for the early years of Osmocom, but during the last few years it became apparent that we also need a public event for our many users. Those range from commercial cellular operators to community based efforts like Rhizomatica, and of course include the many research/lab type users with whom we started.

So now we'll have the public OsmoCon on April 21st, back-to-back with the invitation-only OsmoDevcon from April 22nd through 23rd.

I'm hoping we can bring together a representative sample of our user base at OsmoCon 2017 in April. Looking forward to meeting you all. I hope you're also curious to hear more from other users, and of course the development team.

Regards,
Harald

by Harald Welte at January 31, 2017 11:00 PM

January 30, 2017

Bunnie Studios

Name that Ware January 2017

The Ware for January 2017 is shown below:

This close-up view shows about a third of the circuit board. If it turns out to be too difficult to guess from the clues shown here, I’ll update this post with a full-board photo; but I have a feeling long-time players of Name that Ware might have too easy a time with this one.

by bunnie at January 30, 2017 12:30 PM

Winner, Name that Ware December 2016

The ware for December 2016 is a diaper making machine. The same machine can be configured for making sanitary napkins or diapers by swapping out the die cut rollers and base material; in fact, the line next to the one pictured was producing sanitary napkins at the time this photo was taken. Congrats to Stuart for the first correct guess, email me for your prize!

by bunnie at January 30, 2017 12:30 PM

January 26, 2017

ZeptoBARS

Analog Devices AD584 - precision voltage reference : weekend die-shot

Analog Devices AD584 is a voltage reference with 4 outputs : 2.5, 5, 7.5 and 10V. Tempco is laser trimmed to 15ppm/°C and voltage error to ~0.1%.


Die size 2236x1570 µm.

One can refer to a die photo from AD datasheet showing a bit older design of the same chip:

January 26, 2017 09:01 PM

January 22, 2017

Harald Welte

Autodesk: How to lose loyal EAGLE customers

A few days ago, Autodesk announced that the popular EAGLE electronics design automation (EDA) software is moving to a subscription based model.

Where previously you paid once for a license and could use that version/license as long as you wanted, there now is a monthly subscription fee. Once you stop paying, you lose the right to use the software. Welcome to the brave new world.

I have remotely observed this subscription model as a general trend in the proprietary software universe. So far it hasn't affected me at all, as the only two proprietary applications I have used on a regular basis during the last decade are IDA and EAGLE.

I already have ethical issues with using non-free software, but those two cases have been the exceptions, in order to get to the productivity required by the job. While I can somehow convince my conscience in those two cases that it's OK - using software under a subscription model is completely out of the question, period. Not only would I end up paying for the rest of my professional career in order to be able to open and maintain old design files, but I would also have to accept software that "calls home" and has "remote kill" features. This is clearly not something I would ever want to use on any of my computers. Also, I don't want software to be associated with any account, and it's not the bloody business of the software maker to know when and where I use my software.

For me - and I hope for many, many other EAGLE users - this move is utterly unacceptable and certainly marks the end of any business between the EAGLE makers and myself and/or my companies. I will happily use my current "old-style" EAGLE 7.x licenses for the near future, and then see what kind of improvements I would need to contribute to KiCAD or other FOSS EDA software in order to eventually migrate to those.

As expected, this doesn't only upset me, but many other customers, some of whom have been loyal to EAGLE for many years if not decades, back to the DOS version. This is reflected by some media reports (like this one at hackaday) or user posts at element14.com or eaglecentral.ca, which are similarly critical of this move.

Rest in Peace, EAGLE. I hope Autodesk gets what they deserve: A new influx of migrations away from EAGLE into the direction of Open Source EDA software like KiCAD.

In fact, the more I think about it, I'm actually very much inclined to work on good FOSS migration tools / converters - not only for my own use, but to help more people move away from EAGLE. It's not that I don't have enough projects on my hands already, but at least I'm motivated to do something about this betrayal by Autodesk. Let's see what (if anything) will come out of this.

So let's see it that way: What Autodesk is doing is raising the level of pain of using EAGLE so high that more people will use and contribute to FOSS EDA software. And that is actually a good thing!

by Harald Welte at January 22, 2017 11:00 PM

January 20, 2017

Elphel

Lapped MDCT-based image conditioning with optical aberrations correction, color conversion, edge emphasis and noise reduction

Fig.1. Image comparison of the different processing stages output

Results of the processing of the color image

Previous blog post “Lens aberration correction with the lapped MDCT” described our experiments with the lapped MDCT[1] for optical aberration corrections of a single color channel and separation of the asymmetrical kernel into a small asymmetrical part for direct convolution and a larger symmetrical one to be applied in the frequency domain of the MDCT. We supplemented this processing chain with additional steps of the image conditioning to evaluate the overall quality of the results and feasibility of the MDCT approach for processing in the camera FPGA.

The image comparator in Fig.1 shows the difference between the images generated at several stages of the processing. Any two of the image layers can be compared either by sliding the image separator or by clicking on the image, which alternates the right/left images. Zoom is controlled by the scroll wheel (clicking on the zoom indicator fits the image) and pan by dragging.

The original image was acquired with an Elphel model 393 camera with a 5 Mpix MT9P006 image sensor and a Sunex DSL227 fisheye lens, and saved in jp4 format as raw Bayer data at 98% compression quality. Calibration was performed with the Java program using the calibration pattern visible in the image itself. The program is designed to work with low-distortion lenses, so the fisheye was a stretch, and the calibration kernels near the edges are just replicated from the ones closer to the center, so aberration correction is only partial in those areas.

First two layers differ just by added annotations, they both show output of a simple bilinear demosaic processing, same as generated by the camera when running in JPEG mode. Next layers show different stages of the processing, details are provided later in this blog post.

Linear part of the image conditioning: convolution and color conversion

Correction of the optical aberrations in the image can be viewed as convolution of the raw image array with the space-variant kernels derived from the optical point spread functions (PSF). In the general case of the true space-variant kernels (different for each pixel) it is not possible to use DFT-based convolution, but when the kernels change slowly and the image tiles can be considered isoplanatic (areas where PSF remains the same to the specified precision) it is possible to apply the same kernel to the whole image tile that is processed with DFT (or combined convolution/MDCT in our case). Such an approach is studied in depth for astronomy [2],[3] (where they almost always have plenty of δ-function light sources to measure PSF in the field of view :-)).

The procedure described below is a combination of the sparse kernel convolution in the space domain with the lapped MDCT processing making use of its perfect (only approximate with the variant kernels) reconstruction property, but it still implements the same convolution with the variant kernels.

Signal flow is presented in Fig.2. Input signal is the raw image data from the sensor sampled through the color filter array organized as a standard Bayer mosaic: each 2×2 pixel tile includes one of the red and blue samples, and 2 of the green ones.
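The first step of that signal flow, splitting the mosaic into sparse per-color channels, might look as follows in Python with NumPy. The RGGB ordering of the 2×2 tile is an assumption made for the sketch (the actual ordering depends on the sensor and readout), and `split_bayer` is a made-up name.

```python
import numpy as np


def split_bayer(raw, pattern=("R", "G", "G", "B")):
    """Split a Bayer mosaic into sparse R, G, B planes.

    pattern lists the colors of one 2x2 tile in row-major order; RGGB is an
    assumed default. Positions a color does not sample stay zero.
    """
    planes = {color: np.zeros_like(raw) for color in "RGB"}
    for index, color in enumerate(pattern):
        dy, dx = divmod(index, 2)
        planes[color][dy::2, dx::2] = raw[dy::2, dx::2]
    return planes["R"], planes["G"], planes["B"]


raw = np.arange(1, 17).reshape(4, 4)
red, green, blue = split_bayer(raw)
print(green)
```

Each output plane has the same size as the input, with green populated at half of the positions and red and blue at a quarter each. This is the sparse form that a direct convolution with a small kernel can work on.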

In addition to the image data the process depends on the calibration data – pairs of asymmetrical and symmetrical kernels calculated during camera calibration as described in the previous blog post.


Fig.2. Signal flow of the linear part of MDCT-based image conditioning

Image data is processed in the following sequence of the linear operations, resulting in intensity (Y) and two color difference components:

  1. Input composite signal is split by colors into 3 separate channels producing sparse data in each.
  2. Each channel data is directly convolved with a small (we used just four non-zero elements) asymmetrical kernel AK, resulting in a sequence of 16×16 pixel tiles, overlapping by 8 pixels (input pixels are not limited to 16×16 tiles).
  3. Each tile is multiplied by a window function, folded and converted with 8×8 pixel DCT-IV[4] – equivalent of the 16×16->8×8 MDCT.
  4. 8×8 result tiles are multiplied by symmetrical kernels (SK) – equivalent of convolving the pre-MDCT signal.
  5. Each channel is subject to the low-pass filter that is implemented by multiplying in the frequency domain as these filters are indeed symmetrical. The cutoff frequency is different for the green (LPF1) and other (LPF2) colors as there are more source samples for the first. That was the last step before inverse transformation presented in the previous blog post, now we continued with a few more.
  6. Natural images have strong correlation between different color channels so most image processing (and compression) algorithms involve converting the pure color channels into intensity (Y) and two color difference signals that have lower bandwidth than intensity. There are different standards for the color conversion coefficients and here we are free to use any as this process is not a part of a matched encoder/decoder pair. All such conversions can be represented as a 3×3 matrix multiplication by the (r,g,b) vector.
  7. Two of the output signals – color differences are subject to an additional bandwidth limiting by LPF3.
  8. IMDCT includes 8×8 DCT-IV, unfolding 8×8 into 16×16 tiles, second multiplication by the window function and accumulation of the overlapping tiles in the pixel domain.
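Steps 3, 4 and 8 can be illustrated in one dimension with a short NumPy sketch. It is a simplification under several assumptions: it works on 16-sample tiles producing 8 coefficients (the 1D analogue of 16×16->8×8), computes the MDCT directly from its defining cosine sum instead of the fold + DCT-IV route, uses a sine window, and lets a per-coefficient `gain` vector stand in for the frequency-domain multiplications (SK, LPF). All function names are made up for the example.

```python
import numpy as np


def mdct_basis(n_out):
    """Cosine basis of the MDCT: n_out coefficients from 2*n_out samples."""
    n = np.arange(2 * n_out)
    k = np.arange(n_out)
    return np.cos(np.pi / n_out * np.outer(k + 0.5, n + 0.5 + n_out / 2))


def sine_window(n_out):
    """Sine window; satisfies w[n]^2 + w[n + n_out]^2 = 1."""
    n = np.arange(2 * n_out)
    return np.sin(np.pi * (n + 0.5) / (2 * n_out))


def lapped_filter(x, n_out=8, gain=None):
    """Window -> MDCT -> per-coefficient gain -> IMDCT -> window -> overlap-add."""
    basis = mdct_basis(n_out)
    window = sine_window(n_out)
    gain = np.ones(n_out) if gain is None else np.asarray(gain, dtype=float)
    y = np.zeros(len(x))
    # Tiles are 2*n_out long and advance by n_out, so neighbors overlap by half.
    for start in range(0, len(x) - 2 * n_out + 1, n_out):
        tile = x[start:start + 2 * n_out] * window
        coeffs = (basis @ tile) * gain
        restored = (2.0 / n_out) * (basis.T @ coeffs)
        y[start:start + 2 * n_out] += restored * window
    return y


rng = np.random.default_rng(0)
x = rng.standard_normal(64)
y = lapped_filter(x)
# Samples covered by two overlapping tiles are reconstructed exactly.
print(np.allclose(y[8:-8], x[8:-8]))
```

With unity gain the aliasing introduced by each tile cancels against its neighbors, which is the perfect reconstruction property mentioned earlier. The first and last 8 samples are covered by only one tile and are not recovered.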

Nonlinear image enhancement: edge emphasis, noise reduction

For some applications the output data is already useful – ideally it has all the optical aberrations compensated so the remaining far-reaching inter-pixel correlation caused by a camera system is removed. Next steps (such as stereo matching) can be done on- (or off-) line, and the algorithms do not have to deal with the lens specifics. Other applications may benefit from additional processing that improves image quality – at least the perceived one.

Such processing may target the following goals:

  1. Reduce the remaining signal modulation caused by the Bayer pattern (each source pixel carries data about a single color component, not all 3). Trying to remove it with an LPF would blur the image itself.
  2. Detect and enhance edges, as most useful high-frequency elements represent locally linear features
  3. Reduce visible noise in the uniform areas (such as blue sky) where significant (especially for the small-pixel sensors) noise originates from the shot noise of the pixels. This noise is amplified by the aberration correction that effectively increases the high frequency gain of the system.

Some of these three goals overlap and can be addressed simultaneously – edge detection can improve de-mosaic quality and reduce related colored artifacts on the sharp edges if the signal is blurred along the edges and simultaneously sharpened in the orthogonal direction. Areas that do not have pronounced linear features are likely to be uniform and so can be low-pass filtered.

The non-linear processing produces modified pixel value using 3×3 pixel array centered around the current pixel. This is a two-step process:

  • First the 3×3 center-symmetric matrices (one for Y, another for color) of coefficients are calculated using the Y channel data, then
  • they are applied to the Y and color components by replacing the pixel value with the inner product of the calculated coefficients and the original data.

Signal flow for one channel is presented in Fig.3:

Fig.3 Non-linear image processing: edge emphasis and noise reduction


  1. Four inner products are calculated for the same 9-sample Y data and the shown matrices (corresponding to second derivatives along vertical, horizontal and the two diagonal directions).
  2. Each of these values is squared and
  3. the following four 3×3 matrices are multiplied by these values. Matrices are symmetrical around the center, so gray-colored cells do not need to be calculated.
  4. Four matrices are then added together and scaled by a variable parameter K1. The first two matrices are opposite to each other, and so are the second two. So if the absolute value of the two orthogonal second derivatives are equal (no linear features detected), the corresponding matrices will annihilate each other.
  5. A separate 3×3 matrix representing a weighted running average, scaled by K2 is added for noise reduction.
  6. The sum of the positive values is compared to a specified threshold value, and if it exceeds the threshold, the whole matrix is proportionally scaled down. That makes the different line directions “compete” against each other and against the blurring kernel.
  7. The sum of all 9 elements of the calculated array is zero, so the default unity kernel is added; when the correction coefficients are zeros, the result pixels will be the same as the input ones.
  8. Inner product of the calculated 9-element array and the input data is calculated and used as a new pixel value. Two such arrays are created from the same Y channel data, one for Y and the other for the two color differences; the configurable parameters (K1, K2, threshold and the smoothing matrix) are independent in these two cases.
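
The eight steps above can be sketched in Python for a single pixel. This is only an illustration: the actual matrices are given in the figure and are not reproduced in the text, so the correction matrices (`ALONG_H`, `ALONG_D1`), the smoothing kernel `SMOOTH` and all function names below are assumptions chosen to satisfy the stated properties (zero sum, opposite pairs), not the values used in the camera.

```python
# Step 1: second-derivative detectors along the vertical, horizontal and two diagonals.
D2 = {
    "vert":  [[0, 1, 0], [0, -2, 0], [0, 1, 0]],
    "horiz": [[0, 0, 0], [1, -2, 1], [0, 0, 0]],
    "diag1": [[1, 0, 0], [0, -2, 0], [0, 0, 1]],
    "diag2": [[0, 0, 1], [0, -2, 0], [1, 0, 0]],
}

def negate(m):
    return [[-v for v in row] for row in m]

# Hypothetical zero-sum matrices: blur along a line, sharpen across it.
ALONG_H = [[0, -1, 0], [1, 0, 1], [0, -1, 0]]    # line runs horizontally
ALONG_D1 = [[1, 0, -1], [0, 0, 0], [-1, 0, 1]]   # line runs along the main diagonal
# A large derivative across a direction means the line runs orthogonally to it.
CORR = {"vert": ALONG_H, "horiz": negate(ALONG_H),
        "diag1": negate(ALONG_D1), "diag2": ALONG_D1}
# Hypothetical weighted running average minus the unity kernel (sums to zero).
SMOOTH = [[1 / 16, 2 / 16, 1 / 16], [2 / 16, 4 / 16 - 1, 2 / 16], [1 / 16, 2 / 16, 1 / 16]]

def inner(a, b):
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def adaptive_kernel(y, k1, k2, threshold):
    """Build the 3x3 kernel from the Y neighbourhood (steps 1-7)."""
    kernel = [[k2 * SMOOTH[i][j] for j in range(3)] for i in range(3)]   # step 5
    for name, detector in D2.items():
        strength = inner(detector, y) ** 2                               # steps 1-2
        for i in range(3):
            for j in range(3):
                kernel[i][j] += k1 * strength * CORR[name][i][j]         # steps 3-4
    positive = sum(v for row in kernel for v in row if v > 0)
    if positive > threshold:                                             # step 6
        kernel = [[v * threshold / positive for v in row] for row in kernel]
    kernel[1][1] += 1.0                                                  # step 7
    return kernel

def filter_pixel(y, data, k1, k2, threshold):
    """Step 8: new pixel value for `data`, steered by the Y neighbourhood `y`."""
    return inner(adaptive_kernel(y, k1, k2, threshold), data)
```

Because every correction term sums to zero and the unity kernel is added last, a uniform neighbourhood passes through unchanged, and equal vertical and horizontal second derivatives leave only the smoothing term.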

Next steps

How much is it possible to warp?

The described method of the optical aberration correction is tested with a software implementation that uses only operations that can be ported to the FPGA, so we are almost ready to get back to Verilog programming. One more thing to try before that is to see if it is possible to merge this correction with a minor distortion correction. DFT and DCT transforms are not good at scaling the images (when using the same pixel grid). It is definitely not possible to rectify large areas of the fisheye images, but maybe small (fraction of a pixel per tile) stretching can still be absorbed in the same step with shifting? This may have several implications.

Single-step image rectification

It would definitely be attractive to eliminate an additional processing step and save FPGA resources and/or decrease the processing time. But there is more than that: re-sampling degrades image resolution. For that reason we use a half-pixel grid for the offline processing, but it increases the amount of data 4 times, and the processing resources at least 4 times as well.

When working with the whole pixel grid (as we plan to implement in the camera FPGA) we already deal with the partial pixel shifts during convolution for aberration correction, so it would be very attractive to combine these two fractional pixel shifts into one (the calibration process uses a half-pixel grid) and so avoid double re-sampling and the related image degradation.

Using analytical lens distortion model with the precision of the pixel mapping

Another goal that seems achievable is to absorb at least the table-based pixel mapping. Real lenses can be described by an analytical formula of a radial distortion model only to some precision. Each element can have errors, and a multi-lens assembly inevitably has some misalignments; all that makes the lenses different and makes them deviate from the perfect symmetry of the radial model. When we were working with the (rather low distortion) wide angle lenses Evetar N125B04530W we were able to get to 0.2-0.3 pix root mean square of the reprojection error in a 26-lens camera system when using a radial distortion model with up to the 8-th power of the radial polynomial (with insignificant improvement when going from the 6-th to the 8-th power). That error was reduced to 0.05..0.07 pixels when we implemented table-based pixel mapping for the distortions remaining after the radial model. The difference between a real lens and one of the standard lens models (polynomial for the low-distortion lenses, and f-theta for fisheye and “f-theta” lenses, where the angle from the optical axis depends approximately linearly on the distance from the center in the focal plane) is rather small, so it is a good candidate to be absorbed by the convolution step. While this will not eliminate re-sampling when the image is rectified, the distortion correction process will have a simple analytical formula (already supported by many programs) and will not require a full pixel mapping table.

High resolution Z-axis (distance) measurement with stereo matching of multiple images

Image rectification is an important precondition to perform correlation-based stereo matching of two or more images. When finding the correlation between the images of a relatively large and detailed object it is easy to get resolution of a small fraction of a pixel. And this proportionally increases the distance measurement precision for the same base (distance between the individual cameras). Among other things (such as mechanical and thermal stability of the system) this requires precise measurement of the sub-camera distortions over the overlapping field of view.

When correlating multiple images the far objects (the most challenging ones for getting precise distance information) have low disparity values (maybe just a few pixels), so instead of the complete rectification of the individual images it may be sufficient to have a good “mutual rectification”, so that the processed images of an object at infinity will match on each of the individual images with the same sub-pixel resolution as we achieved for off-line processing. This will require mechanically orienting each sub-camera sensor parallel to the others, pointing them in the same direction and preselecting lenses for matching focal length. After that (when the mechanical match is within a reasonable range of a few percent), calibration is performed and the convolution kernels are calculated that will accommodate the remaining distortion variations of the sub-cameras. In this case application of the described correction procedure in the camera will result in precisely matched images ready for correlation.

These images will not be perfectly rectified, and measured disparity (in pixels) as well as the two (vertical and horizontal) angles to the object will require additional correction. But this X/Y resolution is much less critical than the resolution required for the Z-measurements and can easily tolerate some re-sampling errors. For example, if a car at a distance of 20 meters is viewed by a stereo camera with 100 mm base, then the same pixel error that corresponds to a (practically negligible) 10 mm horizontal shift will lead to a 2 meter error (10%) in the distance measurement.
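
The numbers in this example can be checked with a first-order estimate. The sketch below assumes small angles and an ideal two-camera geometry; the function name is hypothetical.

```python
def distance_error(distance_m, base_m, lateral_shift_m):
    """First-order range error caused by the angular error that
    corresponds to `lateral_shift_m` at range `distance_m`."""
    angle_error = lateral_shift_m / distance_m    # radians, small-angle approximation
    disparity_angle = base_m / distance_m         # angle subtended by the stereo base
    # Range is inversely proportional to disparity, so the relative
    # range error equals the relative disparity error.
    return distance_m * angle_error / disparity_angle

# Car at 20 m, 100 mm base, pixel error equivalent to a 10 mm lateral shift.
error_m = distance_error(20.0, 0.1, 0.010)
```

For these inputs the relative disparity error is 10 mm / 100 mm, which gives the 10% (about 2 meter) range error quoted above.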

References

[1] Malvar, Henrique S. Signal processing with lapped transforms. Artech House, 1992.

[2] Thiébaut, Éric, et al. “Spatially variant PSF modeling and image deblurring.” SPIE Astronomical Telescopes+ Instrumentation. International Society for Optics and Photonics, 2016.

[3] Řeřábek, M., and P. Pata. “The space variant PSF for deconvolution of wide-field astronomical images.” SPIE Astronomical Telescopes+ Instrumentation. International Society for Optics and Photonics, 2008.

[4] Britanak, Vladimir, Patrick C. Yip, and Kamisetty Ramamohan Rao. Discrete cosine and sine transforms: general properties, fast algorithms and integer approximations. Academic Press, 2010.

by Andrey Filippov at January 20, 2017 04:55 AM

January 16, 2017

ZeptoBARS

Vishay TSOP4838 - IR receiver module : weekend die-shot

Vishay TSOP4838 decodes IR commands sent with 38kHz modulation. This modulation (together with the narrow-band filter in the receiver module) is required to reject ambient light sources, which may flicker somewhere at 50-100Hz or 20-30kHz (bad CFLs/LEDs). The black (IR transparent) plastic also helps with background noise.
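
Why a narrow-band filter around the carrier rejects slow flicker can be illustrated numerically. The sketch below is a digital stand-in (a Goertzel single-frequency detector) and not a model of the analog circuit on this die; the sample rate, amplitudes and window length are arbitrary assumptions.

```python
import math

def goertzel_power(samples, sample_rate_hz, target_hz):
    """Squared magnitude of the DFT bin nearest to target_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

SAMPLE_RATE = 400_000          # Hz, assumed
N = 4000                       # 10 ms window
t = [i / SAMPLE_RATE for i in range(N)]
# Weak IR burst on a 38 kHz carrier versus a 10x stronger 100 Hz lamp flicker.
carrier = [0.5 * (1 + math.sin(2 * math.pi * 38_000 * ti)) for ti in t]
flicker = [5.0 * (1 + math.sin(2 * math.pi * 100 * ti)) for ti in t]

carrier_power = goertzel_power(carrier, SAMPLE_RATE, 38_000)
flicker_power = goertzel_power(flicker, SAMPLE_RATE, 38_000)
```

Even though the flicker is ten times stronger in amplitude, almost none of its energy falls into the 38 kHz bin, so `carrier_power` dominates by many orders of magnitude.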


Die size 590x594 µm.

Photodiode is on a separate die:

Die size 1471x1471 µm

January 16, 2017 02:37 PM

January 08, 2017

Altus Metrum

embedded-arm-libc

Finding a Libc for tiny embedded ARM systems

You'd think this problem would have been solved a long time ago. All I wanted was a C library to use in small embedded systems -- those with a few kB of flash and even fewer kB of RAM.

Small system requirements

A small embedded system has a different balance of needs:

  • Stack space is limited. Each thread needs a separate stack, and it's pretty hard to move them around. I'd like to be able to reliably run with less than 512 bytes of stack.

  • Dynamic memory allocation should be optional. I don't like using malloc on a small device because failure is likely and usually hard to recover from. Just make the linker tell me if the program is going to fit or not.

  • Stdio doesn't have to be awesomely fast. Most of our devices communicate over full-speed USB, which maxes out at about 1MB/sec. A stdio setup designed to write to the page cache at memory speeds is over-designed, and likely involves lots of buffering and fancy code.

  • Everything else should be fast. A small CPU may run at only 20-100MHz, so it's reasonable to ask for optimized code. They also have very fast RAM, so cycle counts through the library matter.

Available small C libraries

I've looked at:

  • μClibc. This targets embedded Linux systems, and also appears dead at this time.

  • musl libc. A more lively project; still, definitely targets systems with a real Linux kernel.

  • dietlibc. Hasn't seen any activity for the last three years, and it isn't really targeting tiny machines.

  • newlib. This seems like the 'normal' embedded C library, but it expects a fairly complete "kernel" API and the stdio bits use malloc.

  • avr-libc. This has lots of Atmel assembly language, but is otherwise ideal for tiny systems.

  • pdclib. This one focuses on small source size and portability.

Current AltOS C library

We've been using pdclib for a couple of years. It was easy to get running, but it really doesn't match what we need. In particular, it uses a lot of stack space in the stdio implementation as there's an additional layer of abstraction that isn't necessary. In addition, pdclib doesn't include a math library, so I've had to 'borrow' code from other places where necessary. I've wanted to switch for a while, but there didn't seem to be a great alternative.

What's wrong with newlib?

The "obvious" embedded C library is newlib. Designed for embedded systems with a nice way to avoid needing a 'real' kernel underneath, newlib has a lot going for it. Most of the functions have a good balance between speed and size, and many of them even offer two implementations depending on what trade-off you need. Plus, the build system 'just works' on multi-lib targets like the family of cortex-m parts.

The big problem with newlib is the stdio code. It absolutely requires dynamic memory allocation and the amount of code necessary for 'printf' is larger than the flash space on many of our devices. I was able to get a cortex-m3 application compiled in 41kB of code, and that used a smattering of string/memory functions and printf.

How about avr libc?

The Atmel world has it pretty good -- avr-libc is small and highly optimized for atmel's 8-bit avr processors. I've used this library with success in a number of projects, although nothing we've ever sold through Altus Metrum.

In particular, the stdio implementation is quite nice -- a 'FILE' is effectively a struct containing pointers to putc/getc functions. The library does no buffering at all. And it's tiny -- the printf code lacks a lot of the fancy new stuff, which saves a pile of space.

However, many of the places where performance is critical are written in assembly language, making it pretty darn hard to port to another processor.

Mixing code together for fun and profit!

Today, I decided to try an experiment to see what would happen if I used the avr-libc stdio bits within the newlib environment. There were only three functions written in assembly language, two of them were just stubs while the third was a simple ultoa function with a weird interface. With those coded up in C, I managed to get them wedged into newlib.

Figuring out the newlib build system was the only real challenge; it's pretty awful having generated files in the repository and a mix of autoconf 2.64 and 2.68 version dependencies.

The result is pretty usable though; my STM 32L discovery board demo application is only 14kB of flash while the original newlib stdio bits needed 42kB and that was still missing all of the 'syscalls', like read, write and sbrk.

Here's gitweb pointing at the top of the tiny-stdio tree:

gitweb

And, of course you can check out the whole thing

git clone git://keithp.com/git/newlib

'master' remains a plain upstream tree, although I do have a fix on that branch. The new code is all on the tiny-stdio branch.

I'll post a note on the newlib mailing list once I've managed to subscribe and see if there is interest in making this option available in the upstream newlib releases. If so, I'll see what might make sense for the Debian libnewlib-arm-none-eabi packages.

by keithp's rocket blog at January 08, 2017 07:32 AM

Elphel

Lens aberration correction with the lapped MDCT

Modern small-pixel image sensors exceed the resolution of the lenses, so it is the optics of the camera, not the raw sensor “megapixels”, that define how sharp the images are, especially in the off-center areas. Multi-sensor camera systems that depend on the tiled images do not have any center areas, so the overall system resolution may be as low as that of its worst part.

Fig. 1. Lateral chromatic aberration and Bayer mosaic: a) monochrome (green) PSF, b) composite color PSF, c) Bayer mosaic of the sensor (direction of aberration shown), d) distorted mosaic matching chromatic aberration in b).


De-mosaic processing and chromatic aberrations

Our current cameras' role is to preserve the raw sensor data while providing some moderate compression; all the image correction is applied during post-processing. Handling the lens aberration has to be done before color conversion (or de-mosaicing). When converting Bayer data to color images most cameras start with the calculation of the “missing” colors in the RG/GB pattern using 3×3 or 5×5 kernels; this procedure relies on the specific arrangement of the color filters.

Each of the red and blue pixels has 4 green ones at the same distance (pixel pitch) and 4 of the opposite (R for B and B for R) color at the equidistant diagonal locations. Fig.1. shows how lateral chromatic aberration disturbs these relations.

Fig.1a is the point-spread function (PSF) of the green channel of the sensor. The PSF is measured at twice the resolution of the pixel pitch, so the lens is not that bad – the horizontal distance between the 2 greens in Fig.1c corresponds to 4 pixels of Fig.1a. It is also clearly visible that the PSF is elongated and the radial resolution in this part of the image is better than the tangential one (the lens center is left-down).

Fig.1b shows the superposition of the 3 color channels: the blue center is shifted up-and-right by approximately 2 PSF pixels (so one actual pixel period of the sensor) and the red one half a pixel left-and-down from the green center. So the point light of a star centered around some green pixel will not just spread uniformly to the two “R”s and two “B”s shown connected with lines in Fig.1c, but also to other ones, and in a different order. Fig.1d illustrates the effective positions of the sensor pixels that match the lens aberration.

Aberrations correction at post-processing stage

When we perform off-line image correction we start with separating each color channel and re-sampling it at twice the pixel pitch frequency (adding a zero sample between each pair of measured ones). This increase allows shifting the image by a fraction of a pixel while both preserving resolution and not introducing the phase errors that may be visually OK but hurt when relying on sub-pixel resolution during correlation of images.

The next step is to split the full image into overlapping square tiles and convert them to the frequency domain using a 2-d DFT, followed by multiplication by the inverted PSF kernels, individual for each color channel and each part of the whole image (the calibration procedure provides a 2-d array of PSF kernels). Such multiplication in the frequency domain is equivalent to the (much more computationally expensive) image convolution (or deconvolution, as the desired result is to reduce the effect of the convolution of the ideal image with the PSF of the actual lens). This is possible because of the famous convolution-multiplication property of the Fourier transform and its discrete versions.
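
The convolution-multiplication property can be verified with a small one-dimensional sketch. It uses a naive O(N²) DFT for clarity and an arbitrary smoothing kernel; the real processing is two-dimensional and multiplies by inverted PSF kernels, which is not shown here. All names are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (the inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve(x, h):
    """Direct circular convolution in the sample domain."""
    n = len(x)
    return [sum(x[m] * h[(i - m) % n] for m in range(n)) for i in range(n)]

def convolve_via_dft(x, h):
    """Same result obtained by element-wise multiplication of the spectra."""
    product = [a * b for a, b in zip(dft(x), dft(h))]
    return [v.real for v in dft(product, inverse=True)]

signal = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
kernel = [0.25, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
direct = circular_convolve(signal, kernel)
spectral = convolve_via_dft(signal, kernel)
```

Both lists agree to within floating point rounding. Note that the DFT product corresponds to circular convolution, which is one reason the tiles have to overlap.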

After each color channel tile is corrected and the phases of the color components match (lateral chromatic aberration is compensated), the data may be subject to non-linear processing that relies on the properties of the images (like detection of lines and edges) to combine the color channels, trying to achieve the highest spatial resolution without introducing color artifacts. Our current software performs it while the data is in the frequency domain, before the inverse Fourier transform and merging the lapped tiles into the restored image.

Fig.2. Histogram of difference between the original image and the one after direct and inverse MDCT (with 8×8 pixels DCT-IV)

MDCT of an image – there and back again

It would be very appealing to use DCT-based MDCT instead of DFT for aberration correction. With just an 8×8 point DCT-IV it may be possible to calculate the direct 16×16 -> 8×8 MDCT and the 8×8 -> 16×16 IMDCT providing perfect reconstruction of the image. An 8×8 pixels DCT should be able to handle convolution kernels with an 8 pixel radius; the same would require a 16×16 pixels DFT. I knew it would be a challenge to handle non-symmetrical kernels, but first I tried a 2-d MDCT to convert a camera image and reconstruct it back that way. I was not able to find an efficient Java implementation of the DCT-IV, so I had to write some code following the algorithms presented in [1].

That worked nicely – when I obtained a histogram of the difference between the original image (pixel values were in the range of 0 to 255) and the restored one – IMDCT(MDCT(original)) it demonstrated negligible error. Of course I had to discard 8 pixel border of the image added by replication before the procedure – these border pixels do not belong to 4 overlapping tiles as all internal ones and so can not be reconstructed.
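
The same round trip can be reproduced in one dimension with a short Python sketch. It is not the Java code mentioned above: it uses direct cosine sums instead of a fast DCT-IV, and the sine window and the 2/N normalization are assumptions that satisfy the perfect-reconstruction condition. The 2-d case applies the same transform along rows and columns.

```python
import math

N = 8   # coefficients per block; each block covers 2*N samples with 50% overlap

WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
BASIS = [[math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5)) for k in range(N)]
         for n in range(2 * N)]

def mdct(block):
    """Windowed MDCT: 2*N samples in, N coefficients out."""
    return [sum(WINDOW[n] * block[n] * BASIS[n][k] for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs):
    """Windowed inverse: N coefficients in, 2*N aliased samples out."""
    return [2.0 / N * WINDOW[n] * sum(coeffs[k] * BASIS[n][k] for k in range(N))
            for n in range(2 * N)]

def roundtrip(signal):
    """IMDCT(MDCT(signal)) with overlap-add; len(signal) must be a multiple of N."""
    # Replicated border of N samples, discarded again at the end.
    padded = [signal[0]] * N + list(signal) + [signal[-1]] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = imdct(mdct(padded[start:start + 2 * N]))
        for n in range(2 * N):
            out[start + n] += block[n]   # aliasing cancels between neighbours
    return out[N:-N]

original = [(37 * i) % 256 for i in range(32)]   # values in the range 0..255
restored = roundtrip(original)
max_error = max(abs(a - b) for a, b in zip(original, restored))
```

Each interior sample is covered by two overlapping blocks whose aliasing terms have opposite signs, so `max_error` stays at the level of floating point rounding. The border samples are covered by one block only, which is why they are discarded.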

When this is done in the camera FPGA the error will be higher: the DCT implementation there uses just an integer DSP, not capable of the double precision calculations of the Java code. But for the small 8×8 transformations it should be rather easy to manage calculation precision to the required level.

Convolution with MDCT

It was also easy to implement a low-pass symmetrical filter by multiplying 8×8 pixel MDCT output tiles by a DCT-III transform of the desired convolution kernel. To convolve f ☼ g you need to multiply DCT_IV(f) by DCT_III(g) in the transform domain [2], but that does not mean that DCT-III also has to be implemented in the FPGA – the de-convolution kernels can be prepared during off-line calibration and provided to the camera in the required form.

But not much more can be done for the convolution with asymmetric kernels – they either require additional DST (so DCT and DST) of the image and/or padding data with extra zeros [3],[4] – all that reduces advantage of DCT compared to DFT. Asymmetric kernels are required for the lens aberration corrections and Fig.1 shows two cases not easily suitable for MDCT:

  • lateral chromatic aberrations (or just shift in the image domain) – Fig.1b and
  • “diagonal” kernels (Fig.1a) – not an even function of each of the vertical and horizontal axes.

Fig.3. Convolution kernel factorization: a) required asymmetrical and shifted kernel, b) 4-point direct convolution with (sparse) Bayer color channel data, c) symmetric convolution kernel for MDCT, d) symmetric kernel (DCT-III of c) to multiply DCT-IV kernels of the image.

Symmetric kernels are like what you can do with a twice folded piece of paper, cut to some shape and unfolded, with folds oriented strictly vertically and horizontally.

Factorization of the convolution

Another way to handle convolution with non-symmetrical kernels is to split it in two – first convolve with an asymmetrical one directly and then – use MDCT and symmetrical kernel. The input data for combined convolution is split Bayer data, so each color channel receives sparse sequence – green one has only half non-zero elements and red and blue – only 1/4 such pixels. In the case of half-pixel grid (to handle fraction-pixel shifts) the relative amount of non-zero pixels is four times smaller, so the total number of multiplications is the same as for the whole-pixel grid.

The goal of such factorization is to minimize the number of the non-zero elements in the asymmetrical kernel, imposing no restrictions on the symmetrical one. Factorization does not have to be absolutely precise – the effect of deconvolution is limited by several factors, most important being the amplification of the sensor noise (such as shot noise). The required number of non-zero pixels may vary with the type of the distortion; for the lens we experimented with (Sunex DSL227 fisheye) just 4 pixels were sufficient to achieve 2-4% error for each of the kernel tiles. Four pixel kernels make it 1 multiplication per each of the red and blue pixels and 2 multiplications per green. As the kernels are calculated during the camera off-line calibration it should be possible to simultaneously generate the scheduling of the DSP and buffer memories to additionally reduce the required run-time FPGA resources.
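
One way to read the multiplication counts above is that only the non-zero samples of a sparse color channel have to be multiplied by each of the 4 kernel taps. The sketch below counts them for a whole-pixel RG/GB mosaic; placing R at even rows and even columns is an assumption, and the function names are illustrative.

```python
def bayer_channel_masks(height, width):
    """Return {color: 2-d list of 0/1} marking where each color is sampled."""
    masks = {c: [[0] * width for _ in range(height)] for c in "RGB"}
    for row in range(height):
        for col in range(width):
            if row % 2 == 0 and col % 2 == 0:
                color = "R"
            elif row % 2 == 1 and col % 2 == 1:
                color = "B"
            else:
                color = "G"
            masks[color][row][col] = 1
    return masks

def multiplications_per_pixel(mask, kernel_taps):
    """Average multiplications per pixel position for a sparse direct convolution."""
    total = len(mask) * len(mask[0])
    nonzero = sum(sum(row) for row in mask)
    return kernel_taps * nonzero / total

masks = bayer_channel_masks(8, 8)
counts = {c: multiplications_per_pixel(masks[c], kernel_taps=4) for c in "RGB"}
```

With a 4-tap kernel this gives 1 multiplication per pixel position for red and blue (a quarter of the samples are non-zero) and 2 for green (half are non-zero).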

Fig.3 illustrates how the deconvolution kernel for the aberration correction is split into two for the consecutive convolutions. Fig.3a shows the required deconvolution kernel determined during the existing calibration procedure. This kernel is shown far off-center even for the green channel – it appeared near the edge of the fish-eye lens field of view, as the current lens model is based on the radial polynomial and is not efficient for the fish-eye (f-theta) lenses, so aberration correction by deconvolution had to absorb that extra shift. As the convolution kernel has a fixed number of non-zero elements, the computation complexity does not depend on the maximal kernel dimensions. Fig.3b shows the determined asymmetric convolution kernel of 4 pixels, and Fig.3c the kernel for symmetric convolution with MDCT; the unique 8×8 pixels part of it (inside of the red square) is replicated to the other 3 quadrants by mirroring along row 0 and column 0 because of the whole pixel even symmetry – the right boundary condition for DCT-III. Fig.3d contains the result of the DCT-III applied to the data shown in Fig.3c.

Fig.4. Symmetric convolution kernel tiles in MDCT domain. The full image has peripheral kernels replicated as there was no calibration data outside of the fisheye lens field of view.

There should be some more efficient ways to find optimal combinations of the two kernels, currently I used a combination of the Levenberg-Marquardt Algorithm (LMA) that minimizes approximation error (root mean square of the differences between the given kernel and the result of the convolution of the two calculated) and adding/replacing pixels in the asymmetrical kernel, sorting the variants for the best LMA fit. Experimental code (FactorConvKernel.java) for the kernel calculation is in the same git repository.

Each kernel tile is processed independently of the neighbors, so while the aberration deconvolution kernels are changing smoothly between the adjacent tiles, the individual asymmetrical (for direct convolution with Bayer signal data) and symmetrical (for convolution by multiplication in the MDCT space) may change dramatically (see Fig.4). But when the direct convolution is applied before the window multiplication to the source pixels that contribute to a 16×16 pixel MDCT overlapping tile, then the result (after IMDCT) depends on the convolution of the two kernels, not the individual ones.

Deconvolving the test image

Next step was to apply the convolution to the test image, see if there are any visible blocking (or other) artifacts and if the image sharpness was improved. Only a single (green) channel was tested as there is no DCT-based color conversion code in this program yet. Program was tested with the whole pixel grid (not half pixel) so some reduction of sharpness caused by fractional pixel shift was expected. For the comparison “before/after” aberration correction I used two pairs – one with the raw Bayer data (half of the pixels are black in a checker-board pattern) and the other – with the Bayer pattern after 0.4 pix low-pass filter to reduce the checkerboard pattern. Without this filtering image would be either twice darker or (as on these pictures) saturated at lower levels (checkerboard 0/255 alternating pixels result in average gray level of only half of the full range).

Fig.5. Alternating images of a segment (green channel only): low-pass filter of the Bayer mosaic before and after deconvolution. Modes: raw Bayer; Bayer data, low pass filter, sigma = 0.4 pix; deconvolved.

Fig.5 shows animated GIF of a fraction of the whole image, clicking the image shows comparison to the raw Bayer (with the limited gray level), caption links the full size images for these 3 modes.

No de-noise code is used, so amplification of the pixel shot noise is clearly visible, especially on the uniform surfaces, but aliasing cancellation remained functional even with abrupt changes of the convolution kernels such as the ones shown in Fig.4.

Conclusions

Algorithms suitable for FPGA implementation are tested with the simulation code. Processing of the images subject to the typical optical aberration of the fisheye lens DSL227 does not add significantly to the computational complexity compared to the pure symmetric convolution using lapped MDCT based on the 8×8 pixels two-dimensional DCT-IV.

This solution can be used as a first stage of the real time image correction and rectification, capable of sub-pixel resolution in multiple application areas, such as 3-d reconstruction and autonomous navigation.

References

[1] Plonka, Gerlind, and Manfred Tasche. “Fast and numerically stable algorithms for discrete cosine transforms.” Linear algebra and its applications 394 (2005): 309-345.
[2] Martucci, Stephen A. “Symmetric convolution and the discrete sine and cosine transforms.” IEEE Transactions on Signal Processing 42.5 (1994): 1038-1051.
[3] Suresh, K., and T. V. Sreenivas. “Linear filtering in DCT IV/DST IV and MDCT/MDST domain.” Signal Processing 89.6 (2009): 1081-1089.
[4] Reju, Vaninirappuputhenpurayil Gopalan, Soo Ngee Koh, and Ing Yann Soon. “Convolution using discrete sine and cosine transforms.” IEEE Signal Processing Letters 14.7 (2007): 445.
[5] Malvar, Henrique S. “Extended lapped transforms: Properties, applications, and fast algorithms.” IEEE Transactions on Signal Processing 40.11 (1992): 2703-2714.

by Andrey Filippov at January 08, 2017 01:19 AM

December 30, 2016

Harald Welte

Some thoughts on 33C3

I've just had the pleasure of attending all four days of 33C3 and have returned home with somewhat mixed feelings.

I've been a regular visitor and speaker at CCC events since 15C3 in 1998, which among other things means I'm an old man now. But I digress ;)

The event has come extremely far in those years. And to be honest, I struggle with the size. Back then, it was a meeting of like-minded hackers. You had the feeling that you know a significant portion of the attendees, and it was easy to connect to fellow hackers.

These days, both the number of attendees and the size of the event make you feel that you're among the general public rather than at some meeting of fellow hackers. Yes, it is good to see that more people are interested in what the CCC (and the selected speakers) have to say, but somehow it comes at the price that I (and I suspect other old-timers) feel less at home. It feels too much like various other technology related events.

One aspect creating a certain feeling of estrangement is also the venue itself. There are an incredible number of rooms, with a labyrinth of hallways, stairs, lobbies, etc. The size of the venue simply makes it impossible to _accidentally_ run into all of your fellow hackers and friends. If I want to meet somebody, I have to make an explicit appointment. That is an option that exists most of the rest of the year, too.

While fefe is happy about the many small children attending the event, to me this seems somewhat alien and possibly inappropriate. I guess from teenage years onward it certainly makes sense, as they can follow the talks and participate in the workshop. But below that age?

The range of topics covered at the event also becomes wider, at least I feel that way. Topics like IT security, data protection, privacy, intelligence/espionage and learning about technology have always been present during all those years. But these days we have bloggers sitting on stage and talking about bottles of wine (seriously?).

Contrary to many, I also really don't get the excitement about shows like 'Methodisch Inkorrekt'. Seems to me like mainstream compatible entertainment in the spirit of the 1990s Knoff Hoff Show without much potential to make the audience want to dig deeper into (information) technology.

by Harald Welte at December 30, 2016 12:00 AM

33C3 talk on dissecting cellular modems

Yesterday, together with Holger 'zecke' Freyther, I co-presented at 33C3 about Dissecting modern (3G/4G) cellular modems.

This presentation covers some of our recent explorations into a specific type of 3G/4G cellular modems, which next to the regular modem/baseband processor also contain a Cortex-A5 core that (unexpectedly) runs Linux.

We want to use such modems for building self-contained M2M devices that run the entire application inside the modem itself, without any external needs except electrical power, SIM card and antenna.

Next to that, they also pose an ideal platform for testing the Osmocom network-side projects for running GSM, GPRS, EDGE, UMTS and HSPA cellular networks.

You can find the Slides and the Video recordings in case you're interested in more details about our work.

The results of our reverse engineering can be found in the wiki at http://osmocom.org/projects/quectel-modems/wiki together with links to the various git repositories containing related tools.

As with all the many projects that I happen to end up doing, it would be great to get more people contributing to them. If you're interested in cellular technology and want to help out, feel free to register at the osmocom.org site and start adding/updating/correcting information to the wiki.

You can e.g. help by

  • playing with the modem and documenting your findings
  • reviewing the source code released by Qualcomm + Quectel and documenting your findings
  • helping us create a working OE build with our own kernel and rootfs images as well as opkg package feeds for the modems
  • helping to reverse engineer the DIAG and QMI protocols as well as the open source programs to interact with them

by Harald Welte at December 30, 2016 12:00 AM

December 29, 2016

Harald Welte

Contribute to Osmocom 3.5G and receive a free femtocell

In 2016, Osmocom gained initial 3.5G support with osmo-iuh and the Iu interface extensions of our libmsc and OsmoSGSN code. This means you can run your own small open source 3.5G cellular network for SMS, Voice and Data services.

However, the project needs more contributors: Become an active member in the Osmocom development community and get your nano3G femtocell for free.

I'm happy to announce that my company sysmocom hereby issues a call for proposals to the general public. Please describe in a short proposal how you would help us improve the Osmocom project if you were to receive one of those free femtocells.

Details of this proposal can be found at https://sysmocom.de/downloads/accelerate_3g5_cfp.pdf

Please contact mailto:accelerate3g5@sysmocom.de in case of any questions.

by Harald Welte at December 29, 2016 12:00 AM

December 25, 2016

ZeptoBARS

ST USBLC6-2 - USB protection chip : weekend die-shot

ST USBLC6-2 has 4 diodes and 1 Zener to protect your USB gear.
Die size 1084x547 µm.


December 25, 2016 09:01 PM

December 24, 2016

Bunnie Studios

Name that Ware December 2016

The Ware for December 2016 is below.

Wishing everyone a safe and happy holiday season!

by bunnie at December 24, 2016 05:08 PM

Winner, Name that Ware November 2016

The Ware for November 2016 is a Link Instruments MSO-28 USB scope. Congrats to Antoine for the first guess which got the model number correct, email me for your prize!

by bunnie at December 24, 2016 05:08 PM

December 23, 2016

Elphel

Measuring SSD interrupt delays


Sometimes we need to test disks connected to a camera and find out whether a particular model is a good candidate for in-camera stream recording. Such disks should not only be fast enough in terms of write speed, but should also have a short ‘response time’ to write commands. This ‘response time’ is basically the time between a command being sent to the disk and the disk's response that the command has finished. This time is related to the total write speed, but it can vary due to processes going on in the internal disk controller. The fluctuations in disk response time can be an important parameter for high-bandwidth streaming applications in embedded systems, as this value allows us to estimate the data buffer size needed during recording; it may be less critical for typical PC applications, as modern computers are equipped with large amounts of RAM. We did not find any parameter in the specifications of the disks we had that would give us a hint for the buffer size estimation, so we developed a small test program for this purpose.

This program basically resembles camogm (the in-camera recording program) in its operation and allows us to write repeating blocks of data containing a counter value and then check the consistency of the data written. The program works directly with the disk driver and collects some statistics during its operation. The disk driver, among other things, measures the time between two events: when a write command is issued and when the command completion interrupt from the controller is received. This time can be used to measure disk write speed, as the amount of data sent to disk with each command is also known. In general, this time fluctuates slightly around its average value, given that the amount of data written with each command is almost the same. But long-run tests have shown that sometimes the interrupt return time after a write command can be much longer than the average time.

We decided to investigate this situation in a little more detail and tested two SSDs with our test program. The disks used for the tests were a SanDisk SD8SMAT128G1122 and a Crucial CT250MX200SSD6; both were connected to the eSATA camera port over an M.2 SSD adapter. We had used these disks before and they demonstrated different performance during recording. We ran camogm_test to write 3 MB blocks of data in cyclic mode. The program collected the delayed interrupt times reported by the driver as well as the amount of data written since the last delay event. The processed results of the test:

[Chart: IRQ distribution bars, Crucial CT250MX200SSD6]
[Chart: IRQ distribution bars, SanDisk SD8SMAT128G1122]

Actual points of interest on these charts are circled in red and they show those delays that are noticeably different from average values. Below is the same data in table form:

Disk                     | Average IRQ reception time, ms | Standard deviation, ms | Average IRQ delay time, ms | Standard deviation, ms | Data recorded since last IRQ delay, GB | Standard deviation, GB
CT250MX200SSD6 (250 GB)  | 11.9                           | 1.1                    | 804                        | 12.7                   | 499.7                                  | 111.7
SD8SMAT128G1122 (128 GB) | 19.3                           | 4.8                    | 113                        | 6.5                    | 231.5                                  | 11.5

The delayed interrupt times of these disks are considerably different, although the difference in average interrupt times, which reflect disk write speeds, is not that big. It is interesting to note that the amount of data written to disk between two consecutive interrupt delays is almost twice the total disk size. smartctl reported an increase of the Runtime_Bad_Block attribute for the CT250MX200SSD6 after each delay, but the delays occurred on different LBAs each time. Unfortunately, the SD8SMAT128G1122 does not have such a parameter among its smartctl attributes, so it is difficult to compare the two disks by this parameter.
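
The table can be turned into a rough buffer estimate. The following is only a sketch: it derives the sustained write speed from the 3 MB block size and the average IRQ reception time, and then the amount of data that would pile up during one delayed interrupt. The stream rate of 100 MB/s is an assumed figure chosen for illustration, not a value from the tests, and the function names are hypothetical.

```python
BLOCK_MB = 3.0  # block size written per command in the test

# (average IRQ reception time, ms; average IRQ delay time, ms) from the table
DISKS = {
    "CT250MX200SSD6": (11.9, 804.0),
    "SD8SMAT128G1122": (19.3, 113.0),
}

def write_speed_mb_s(avg_irq_ms):
    """Sustained write speed implied by one block per average IRQ time."""
    return BLOCK_MB / (avg_irq_ms / 1000.0)

def buffer_needed_mb(stream_mb_s, delay_ms):
    """Data that arrives while the disk does not complete a command."""
    return stream_mb_s * delay_ms / 1000.0

STREAM_MB_S = 100.0  # assumption: hypothetical recording data rate

for name, (avg_ms, delay_ms) in DISKS.items():
    speed = write_speed_mb_s(avg_ms)
    buf = buffer_needed_mb(STREAM_MB_S, delay_ms)
    print(f"{name}: {speed:.1f} MB/s average, {buf:.1f} MB buffer per delay")
```

Under these assumptions the faster disk needs the larger buffer, because its occasional delay is about seven times longer.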

by Mikhail Karpenko at December 23, 2016 01:56 AM

December 18, 2016

ZeptoBARS

LM338K - 5A LDO in TO-3 : weekend die-shot



Die size 1834x1609 µm.

You can see why this giant package is nearly obsolete these days (it's been around since 1955): a tiny crystal in a large steel case is limited mostly by the thermal conduction of the steel. Modern packages with a copper base could do better at a much smaller size.


December 18, 2016 02:03 PM

December 17, 2016

ZeptoBARS

DTA143ZK - PNP BJT with bias resistors : weekend die-shot

Compared to the Infineon BCR185W, the bias resistors are not even placed under the pads, hence the larger die size (426x424 µm).


December 17, 2016 10:10 PM

Elphel

DCT type IV implementation

As we finished with the basic camera functionality and tested the first Eyesis4π built with the new 10393 system boards (it is smaller, requires less power, and is faster), we are moving forward with the in-camera image processing. We plan to combine our current camera calibration methods that require off-line post-processing with real-time image correction using the camera's own FPGA resources. Developing this project will require switching between the actual FPGA coding and a software implementation of the same algorithms before going to the next step – software is still easier to design. The first part was in the FPGA realm – it was to implement the fundamental image processing block that we already know we’ll be using and see how much of the resources it needs.

DCT type IV as a building block for in-camera image processing

We consider a small (8×8 pixel) DCT-IV to be a universal block for conditioning of the raw acquired images. Such operations as lens optical aberrations correction, color conversion (de-mosaic) in the presence of the lateral chromatic aberration, image rectification (de-warping) are easier to perform in the frequency domain using convolution-multiplication property and other algorithms.

In post-processing we use DFT (Discrete Fourier Transform) over rather large (64×64 to 512×512) tiles, but that would be too much for the in-camera processing. First is the tile size – for good lenses we do not need that large convolution kernels. Additionally we plan to combine several processing steps into one (based on our off-line post-processing experience) and so we do not need to sub-sample images – in our current software we double resolution of the raw images at the beginning and scale back the final result to reduce image degradation caused by re-sampling.

The second area where we plan to reduce computations is the replacement of the DFT with the DCT, which is designed to be fed with purely real data and so requires fewer arithmetic operations than the DFT, which processes complex input values.

Why “type IV” of the DCT?

Fig.1. Signal flow graph for DCT-IV


We already have DCT type II implemented for the JPEG/JP4 compression, and we still needed another one. Type IV is used in audio compression because it can be converted to a modified discrete cosine transform (MDCT) – a procedure in which multiple overlapped windows are processed one at a time and the results are seamlessly combined without the block artifacts that are familiar from JPEG at low compression quality settings. We too need a lapped transform to process large images with relatively small (much smaller than the image itself) convolution kernels, and DCT-IV is a perfect fit. An 8-point DCT-IV allows transforming 16-point segments with 8-point overlap in a reversible manner – the inverse transformation of the 8-point data can be converted to 16-point overlapping segments, and when added together these segments result in the original data.
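
The reversible 16-point/8-point-overlap scheme can be checked numerically. The following is a minimal software sketch, not the FPGA design: it folds each 16-sample segment into 8 values, applies an 8-point DCT-IV, inverts it, unfolds back to 16 samples and overlap-adds. No window function is used and the scaling is chosen so that the sum comes out exact; all function names are hypothetical.

```python
import math

N = 8  # DCT-IV size; each lapped segment covers 2*N = 16 samples

def dct4(x):
    """Direct (slow) DCT-IV of a list of numbers."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5))
                for j in range(n)) for k in range(n)]

def forward_segment(seg):
    """Fold 16 samples (a, b, c, d) into 8 and apply the 8-point DCT-IV."""
    h = N // 2
    a, b, c, d = seg[0:h], seg[h:2 * h], seg[2 * h:3 * h], seg[3 * h:4 * h]
    u1 = [-(c[h - 1 - i] + d[i]) for i in range(h)]  # -reversed(c) - d
    u2 = [a[i] - b[h - 1 - i] for i in range(h)]     # a - reversed(b)
    return dct4(u1 + u2)

def inverse_segment(coeffs):
    """Invert the DCT-IV and unfold to 16 samples that still contain aliasing."""
    u = [2.0 / N * v for v in dct4(coeffs)]  # DCT-IV is its own inverse up to 2/N
    h = N // 2
    u1, u2 = u[:h], u[h:]
    return ([v / 2 for v in u2] + [-v / 2 for v in reversed(u2)]
            + [-v / 2 for v in reversed(u1)] + [-v / 2 for v in u1])

def lapped_roundtrip(signal):
    """Transform overlapping segments and add the inverse results together."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        seg = inverse_segment(forward_segment(signal[start:start + 2 * N]))
        for i, v in enumerate(seg):
            out[start + i] += v
    return out

signal = [float((7 * i) % 11) for i in range(32)]
restored = lapped_roundtrip(signal)
# Samples covered by two segments (8..23) are recovered; the edges are not.
print(max(abs(restored[i] - signal[i]) for i in range(8, 24)))
```

Each inverse segment alone is not the original data; the aliasing terms cancel only when neighbouring segments are added, which is why the first and last 8 samples of the test signal are left out of the comparison.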

There is a price to pay, though, for switching from DFT to DCT – the convolution-multiplication property that is so straightforward for the DFT gets complicated for the DCT[1]. While convolving with symmetrical kernels is still simple (the kernel just has to be transformed differently, but that is done off-line in our case anyway), arbitrary kernel convolution (or just a shift in image space needed to compensate for the lateral chromatic aberration) requires both DCT-IV and DST-IV transformed data. DST-IV can be calculated with the same DCT-IV modules (just by reversing the direction of the input data and alternating the sign of the output samples), but it still requires additional hardware resources and/or more processing time. Luckily it is only needed for the direct (image domain to frequency domain) transform; the inverse transform IDCT-IV (frequency to image) does not require DST. And IDCT-IV is actually the same as the direct DCT-IV, so we can again instantiate the same module.
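
Both properties used here, DST-IV obtained from a DCT-IV module and the DCT-IV serving as its own inverse, are easy to verify in software. The following is a small sketch using the textbook definitions; the 2/N scale factor of the inverse is a property of this unnormalized definition and says nothing about how the FPGA module scales its output.

```python
import math

def dct4(x):
    """Direct DCT-IV: X[k] = sum x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5))
                for j in range(n)) for k in range(n)]

def dst4_direct(x):
    """Direct DST-IV, used only as a reference."""
    n = len(x)
    return [sum(x[j] * math.sin(math.pi / n * (j + 0.5) * (k + 0.5))
                for j in range(n)) for k in range(n)]

def dst4_via_dct4(x):
    """DST-IV from DCT-IV: reverse the input, alternate the output signs."""
    y = dct4(x[::-1])
    return [(-1) ** k * v for k, v in enumerate(y)]

def idct4(coeffs):
    """Inverse DCT-IV: the same transform again, scaled by 2/N."""
    n = len(coeffs)
    return [2.0 / n * v for v in dct4(coeffs)]

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
print(max(abs(p - q) for p, q in zip(dst4_direct(x), dst4_via_dct4(x))))
print(max(abs(p - q) for p, q in zip(idct4(dct4(x)), x)))
```

Both printed differences should be at the level of floating-point rounding.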

Most two-dimensional transforms combine 1-d transform modules (because DCT is a separable transform), so we too started with just an 8-point DCT. There are multiple known factorizations for such an algorithm[2] and we used one of them (based on BinDCT-IV), shown in Fig.1.

Fig.2. Simplified diagram of Xilinx DSP48E1 primitive (only used functionality is shown)


DSP primitive in Xilinx Zynq

This algorithm is implemented with a pair of DSP48E1[3] primitives shown in Fig.2. This primitive is flexible and can be configured for different functionality; the diagram contains only the blocks and connections used in the current project. The central part is the multiplier (signed 18 bits by signed 25 bits) with inputs from a pair of multiplexed B registers (B1 and B2, 18 bits wide) and the pre-adder AD register (25 bits). The AD register stores the sum/difference of the 25-bit D register and a multiplexed pair of 25-bit A1 and A2 registers. Any of the inputs can be replaced by zero, so AD can receive D, A1, A2, -A1, -A2, D+A1, D-A1, D+A2 and D-A2 values. The result of the multiplier (43 bits) is stored in the M register, and the data from M is combined with the 48-bit output accumulator register P. The final adder can add or subtract M to/from one of P, the 48-bit C register or just 0, so the output P register can receive +/-M, P+/-M and C+/-M. The wrapper module dsp_ma_preadd_c.v reduces the number of DSP48E1 signals and parameters to those required for the project and, in addition to the primitive instance, has a simple model of the DSP slice to allow simulation without the DSP48E1 source code.
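
The data path described above can be summarized as a tiny behavioral model. This is a sketch of the arithmetic only, ignoring the pipeline registers and the A1/A2 and B1/B2 multiplexing; the function and parameter names are hypothetical and are not the interface of dsp_ma_preadd_c.v.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def dsp_step(a, b, d, c, p, use_a=True, use_d=True, sub_a=False,
             acc="zero", sub_m=False):
    """One multiply-accumulate: P_next = {0, P, C} +/- B * (D +/- A)."""
    a_term = (-a if sub_a else a) if use_a else 0
    d_term = d if use_d else 0
    ad = wrap_signed(d_term + a_term, 25)   # pre-adder result, 25 bits
    m = wrap_signed(b, 18) * ad             # multiplier result fits in 43 bits
    base = {"zero": 0, "p": p, "c": c}[acc]  # source for the final adder
    return wrap_signed(base - m if sub_m else base + m, 48)

# C + B * (D - A) = 100 + 5 * (10 - 3)
print(dsp_step(a=3, b=5, d=10, c=100, p=7, sub_a=True, acc="c"))
```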

Fig.3. One-dimensional 8-point DCT-IV implementation


8-point DCT-IV transform

The DCT-IV implementation module (Fig.3.) operates in 16 clock cycles (2 clock periods per data item) and the input/output permutations are not included – they can be absorbed in the data source and destination memories. The current implementation does not implement correct rounding and saturation, to save resources – such processing can be added to the outputs after analysis for particular application data widths. This module is not in the coder/decoder signal chain so bit-accuracy is not required.

Data is output every other cycle (so two such modules can easily be used to increase bandwidth), while the input data is scrambled more: some of the items have to appear twice in a 16-cycle period. This implementation uses two of the DSP48E1 primitives connected in series. The first one implements the left half of the Fig.1. graph – 3 rotators (marked R8, and two of R4), four adders, and four subtracters. The second one corresponds to the right half with R1, R5, R9, R13, four adders, and four subtracters. Two small memories (register files) – 2 locations before the first DSP and 4 locations before the second – effectively increase the number of the DSP internal D registers. The B inputs of the DSPs receive cosine coefficients; the same ROM provides values for both DSP stages.

The diagram shows just the data paths; all the DSP control signals as well as the memory write and read addresses are generated at defined times decoded from the 16-cycle period. The decoder is based on the spreadsheet draft of the design.


Fig.4. Two-dimensional 8×8 DCT-IV

Two-dimensional 8×8 points DCT-IV

The next diagram (Fig.4.) shows a two-dimensional DCT type IV implementation using four of the 1-d 8-point DCT-IV modules described above. Input data arrives continuously in line-scan order; the next 64-item block may follow either immediately or after a delay of at least 16 cycles so the pipeline phases are correctly restarted. Two input 8×25 memories (the width can be reduced to match the input data; 25 is the width of the DSP48E1 inputs) are used to re-order the input data. As each of the 1-d DCT modules requires input data on more than half of the cycles (see bottom of Fig.3), interleaving with a common memory for both channels is not possible, so each channel has to have a dedicated one. The first of the two DCT modules converts the even lines of 8 points, the other one the odd lines. The latency of the data output from the RAM in the second channel is made 1 cycle longer, so the output data from the channels also arrive at odd/even time slots and can be multiplexed to a common transpose buffer memory. The minimal size of the buffer is two 64-item pages (the width can be reduced to match application requirements), but having just a two-page buffer increases the minimal pause time between blocks (if they are not immediate); with a four-page buffer (and BRAM primitives are larger even if just halves are used) the minimal non-immediate delay of the 16 cycles of a 1-d module is still valid.

The second (vertical) pass is similar to the first (horizontal) one, it also has individual small memories for input data reordering and 2 output de-scrambler memories. It is possible to use a single stage, but the memory should hold at least 17 items (>16) and the primitives are 16-deep, and I believe that splitting in series makes it easier for the placer/router tools to implement the design.
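
The horizontal pass, transpose buffer and vertical pass have a direct software counterpart. The following is a minimal sketch of that separable structure using a slow reference DCT-IV; it models only the arithmetic, not the timing, memories or even/odd line interleaving of the FPGA design, and the function names are hypothetical.

```python
import math

def dct4(x):
    """Direct (slow) 1-d DCT-IV."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5))
                for j in range(n)) for k in range(n)]

def transpose(m):
    """Swap rows and columns, like the transpose buffer between passes."""
    return [list(col) for col in zip(*m)]

def dct4_2d(tile):
    """2-d DCT-IV: transform every row, transpose, transform again."""
    horizontal = [dct4(row) for row in tile]
    vertical = [dct4(col) for col in transpose(horizontal)]
    return transpose(vertical)

tile = [[float((3 * r + 5 * c) % 7) for c in range(8)] for r in range(8)]
coeffs = dct4_2d(tile)
# Applying the same transform again and scaling by (2/8)^2 restores the tile.
restored = [[v / 16.0 for v in row] for row in dct4_2d(coeffs)]
print(max(abs(restored[r][c] - tile[r][c]) for r in range(8) for c in range(8)))
```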

Next steps

Now that the 8×8 point DCT-IV is designed and simulated, the next step is to switch to Java coding (adding to our ImageJ plugin for camera calibration and image post-processing), convert the calibration data to a form suitable for the future migration to FPGA, and try the processing based on the chosen 8×8 DCT-IV. When satisfied with the results, we will continue with the FPGA coding.

References

[1] Martucci, Stephen A. “Symmetric convolution and the discrete sine and cosine transforms.” IEEE Transactions on Signal Processing 42.5 (1994): 1038-1051.

[2] Britanak, Vladimir, Patrick C. Yip, and Kamisetty Ramamohan Rao. Discrete cosine and sine transforms: general properties, fast algorithms and integer approximations. Academic Press, 2010.

[3] 7 Series DSP48E1 Slice, UG479 (v1.9), Xilinx, Sep. 2016.

by Andrey Filippov at December 17, 2016 07:15 AM

December 16, 2016

Harald Welte

Accessing 3GPP specs in PDF format

When you work with GSM/cellular systems, the definitive resource is the specifications. They were originally released by ETSI, later by 3GPP.

The problem starts with the fact that there are separate numbering schemes. Everyone in the cellular industry I know always uses the GSM/3GPP TS numbering scheme, i.e. something like 3GPP TS 44.008. However, ETSI assigns its own numbers to the specs, like ETSI TS 144008. Now in most cases, it is as simple as removing the '.' and prefixing a '1' at the beginning. However, that's not always true and there are exceptions such as 3GPP TS 01.01 mapping to ETSI TS 101855. To make things harder, there doesn't seem to be a machine-readable translation table between the spec numbers, but there's a website for spec number conversion at http://webapp.etsi.org/key/queryform.asp
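
The simple case of that mapping can be written down in a few lines. This is only a sketch of the rule of thumb described above: the exception table holds just the single example named in the text, so the result cannot be trusted for specs that do not follow the simple rule. The names used are hypothetical.

```python
# The only exception named in the text; a complete table would need more
# entries, and no machine-readable source for them seems to exist.
KNOWN_EXCEPTIONS = {"01.01": "101855"}

def etsi_number(ts_number):
    """Map a 3GPP TS number such as '44.008' to its ETSI TS number."""
    if ts_number in KNOWN_EXCEPTIONS:
        return KNOWN_EXCEPTIONS[ts_number]
    series, _, document = ts_number.partition(".")
    return "1" + series + document  # remove the '.', prefix a '1'

print(etsi_number("44.008"))
print(etsi_number("01.01"))
```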

When I started to work on GSM related topics somewhere between my work at Openmoko and the start of the OpenBSC project, I manually downloaded the PDF files of GSM specifications from the ETSI website. This was a cumbersome process, as you had to enter the spec number (e.g. TS 04.08) in a search window, look for the latest version in the search results, click on that and then click again for accessing the PDF file (rather than a proprietary Microsoft Word file).

At some point a poor girlfriend of mine was kind enough to do this manual process for each and every 3GPP spec, and then create a corresponding symbolic link so that you could type something like evince /spae/openmoko/gsm-specs/by_chapter/44.008.pdf into your command line and get instant access to the respective spec.

However, of course, this gets out of date over time, and by now almost a decade has passed without a systematic update of that archive.

To the rescue, 3GPP started a long time ago to not only provide the obnoxious M$ Word DOC files, but also to have deep links to ETSI. So you could go to http://www.3gpp.org/DynaReport/44-series.htm and then click on 44.008, and one further click later you had the desired PDF, served by ETSI (3GPP apparently never provided PDF files).

However, in their infinite wisdom, at some point in 2016 the 3GPP webmaster decided to remove those deep links. Rather than a nice long list of released versions of a given spec, http://www.3gpp.org/DynaReport/44008.htm now points to some crappy JavaScript tabbed page, where you can click on the version number and then get a ZIP file with a single Word DOC file inside. You can hardly make it any more inconvenient and cumbersome. The PDF links would open immediately in a modern browser's built-in JavaScript PDF viewer or your favorite PDF viewer. A single click to the information you want. But no, the PDF links had to go and be replaced with ZIP file downloads that you first need to extract, and then open in something like LibreOffice, taking ages to load the document and rendering it improperly in a word processor. I don't want to edit the spec, I want to read it, sigh.

So since the usability of this 3GPP specification resource had been artificially crippled, I was sufficiently annoyed to come up with a solution:

  • first create a complete mirror of all ETSI TS (technical specifications) by using a recursive wget on http://www.etsi.org/deliver/etsi_ts/
  • then use a shell script that utilizes pdfgrep and awk to determine the 3GPP specification number (it is written in the title on the first page of the document) and create a symlink. Now I have something like 44.008-4.0.0.pdf -> ts_144008v040000p.pdf
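
The naming of the symlink can be illustrated as follows. This is not the published shell script, just a sketch: it assumes the 3GPP number has already been determined (by pdfgrep in the real workflow) and that the version is encoded in the ETSI file name as three two-digit groups, which is inferred from the single example above. The function name is hypothetical.

```python
import re

# ts_<etsi number>v<major><minor><patch>p.pdf, two digits per version field
ETSI_NAME = re.compile(r"ts_(\d+)v(\d{2})(\d{2})(\d{2})p\.pdf$")

def link_name(spec_number, etsi_filename):
    """Build a name like '44.008-4.0.0.pdf' for a given ETSI PDF file."""
    match = ETSI_NAME.search(etsi_filename)
    if match is None:
        raise ValueError("unexpected ETSI file name: " + etsi_filename)
    major, minor, patch = (int(group) for group in match.groups()[1:])
    return f"{spec_number}-{major}.{minor}.{patch}.pdf"

print(link_name("44.008", "ts_144008v040000p.pdf"))
```

The resulting name could then be passed to os.symlink together with the path of the mirrored ETSI file.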

It's such a waste of resources to have to download all those files and then write a script using pdfgrep+awk to re-gain the same usability that the 3GPP chose to remove from their website. Now we can wait for ETSI to disable indexing/recursion on their server, and easy and quick spec access would be gone forever :/

Why does nobody care about efficiency these days?

If you're also an avid 3GPP spec reader, I'm publishing the rather trivial scripts used at http://git.osmocom.org/3gpp-etsi-pdf-links

If you have contacts to the 3GPP webmaster, please try to motivate them to reinstate the direct PDF links.

by Harald Welte at December 16, 2016 12:00 AM

December 15, 2016

ZeptoBARS

LM1813 - early anti-skid chip : weekend die-shot

LM1813 - an anti-skid chip, was the largest analog die National Semiconductor had built as of 1974. It was built as a custom part for a brake system vendor to Ford Motor Company for use in their pickup trucks.

Die size 2234x1826 µm.



Test chips on the wafer:


Thanks for the wafers go to Bob Miller, one of the designers of this chip.

December 15, 2016 11:01 AM

December 12, 2016

Free Electrons

Linux 4.9 released, Free Electrons contributions

Linus Torvalds released the 4.9 Linux kernel yesterday, as was expected. With 16214 non-merge commits, this is by far the busiest kernel development cycle ever, though in large part due to the merging of thousands of commits to add support for Greybus. LWN has summarized very well what’s new in this kernel release: 4.9 Merge window part 1, 4.9 Merge window part 2, The end of the 4.9 merge window.

As usual, we take this opportunity to look at the contributions Free Electrons made to this kernel release. In total, we contributed 116 non-merge commits. Our most significant contributions this time have been:

  • Free Electrons engineer Boris Brezillon, already a maintainer of the Linux kernel NAND subsystem, becomes a co-maintainer of the overall MTD subsystem.
  • Contribution of an input ADC resistor ladder driver, written by Alexandre Belloni. As explained in the commit log: a common way of multiplexing buttons on a single input in cheap devices is to use a resistor ladder on an ADC. This driver supports that configuration by polling an ADC channel provided by IIO.
  • On Atmel platforms, improvements to clock handling, bug fix in the Atmel HLCDC display controller driver.
  • On Marvell EBU platforms
    • Addition of clock drivers for the Marvell Armada 3700 (Cortex-A53 based), by Grégory Clement
    • Several bug fixes and improvements to the Marvell CESA driver, for the crypto engine found in most Marvell EBU processors. By Romain Perier and Thomas Petazzoni
    • Support for the PIC interrupt controller, used on the Marvell Armada 7K/8K SoCs, currently used for the PMU (Performance Monitoring Unit). By Thomas Petazzoni.
    • Enabling of Armada 8K devices, with support for the slave CP110 and the first Armada 8040 development board. By Thomas Petazzoni.
  • On Allwinner platforms
    • Addition of GPIO support to the AXP209 driver, which is used to control the PMIC used on most Allwinner designs. Done by Maxime Ripard.
    • Initial support for the Nextthing GR8 SoC. By Mylène Josserand and Maxime Ripard (pinctrl driver and Device Tree)
    • The improved sunxi-ng clock code, introduced in Linux 4.8, is now used for Allwinner A23 and A33. Done by Maxime Ripard.
    • Add support for the Allwinner A33 display controller, by re-using and extending the existing sun4i DRM/KMS driver. Done by Maxime Ripard.
    • Addition of bridge support in the sun4i DRM/KMS driver, as well as the code for a RGB to VGA bridge, used by the C.H.I.P VGA expansion board. By Maxime Ripard.
  • Numerous cleanup and improvement commits in the UBI subsystem, in preparation for merging the support for Multi-Level Cells NAND, from Boris Brezillon.
  • Improvements in the MTD subsystem, by Boris Brezillon:
    • Addition of mtd_pairing_scheme, a mechanism which allows expressing the pairing of NAND pages in Multi-Level Cells NANDs.
    • Improvements in the selection of NAND timings.

In addition, a number of Free Electrons engineers are also maintainers in the Linux kernel, so they review and merge patches from other developers, and send pull requests to other maintainers to get those patches integrated. This led to the following activity:

  • Maxime Ripard, as the Allwinner co-maintainer, merged 78 patches from other developers.
  • Grégory Clement, as the Marvell EBU co-maintainer, merged 43 patches from other developers.
  • Alexandre Belloni, as the RTC maintainer and Atmel co-maintainer, merged 26 patches from other developers.
  • Boris Brezillon, as the MTD NAND maintainer, merged 24 patches from other developers.

The complete list of our contributions to this kernel release:

by Thomas Petazzoni at December 12, 2016 04:06 PM

Software architecture of Free Electrons’ lab

As stated in a previous blog post, we officially launched our lab on April 25th, 2016 and it has been contributing to KernelCI since then. In a series of blog posts, we’d like to present in detail how our lab works.

We previously introduced the lab and its integration in KernelCI, and presented its hardware infrastructure. Now it is time to explain how it actually works on the software side.

Continuous integration in Linux kernel

Because of Linux’s well-known ability to run on numerous platforms and the obvious impossibility for developers to test changes on all these platforms, continuous integration has a big role to play in Linux kernel development and maintenance.

More generally, continuous integration is made up of three different steps:

  • building the software which in our case is the Linux kernel,
  • testing the software,
  • reporting the test results.

KernelCI complete process

KernelCI checks hourly whether any of the Git repositories it tracks has been updated. If so, it builds, from the last commit, the kernel for ARM, ARM64 and x86 platforms in many configurations. Then it stores all these builds in publicly available storage.

Once the kernel images have been built, KernelCI itself is not in charge of testing it on hardware. Instead, it delegates this work to various labs, maintained by individuals or organizations. In the following section, we will discuss the software architecture needed to create such a lab, and receive testing requests from KernelCI.

Core software component: LAVA

At this moment, LAVA is the only software supported by KernelCI, but note that KernelCI offers an API, so if LAVA does not meet your needs, go ahead and make your own!

What is LAVA?

LAVA is self-hosted software, organized in a server-dispatcher model, for controlling boards in order to automate boot, bootloader and user-space testing. The server receives jobs specifying what to test, how, and on which boards to run those tests, and transmits those jobs to the dispatcher linked to the specified board. The dispatcher applies all the modifications to the kernel image needed to make it boot on that board and then interacts with it through the serial connection.

Since LAVA has to fully and autonomously control boards, it needs to:

  • interact with the board through serial connection,
  • control the power supply to reset the board in case of a frozen kernel,
  • know the commands needed to boot the kernel from the bootloader,
  • serve files (kernel, DTB, rootfs) to the board.

The first three requirements are fulfilled by LAVA thanks to per-board configuration files. The last one is handled by the LAVA dispatcher in charge of the board, which downloads the files specified in the job and copies them to a directory accessible by the board through TFTP.

LAVA organizes the lab in devices and device types. All identical devices are from the same device type and share the same device type configuration file. It contains the set of bootloader instructions to boot the kernel (e.g.: how and where to load files) and the bootloader configuration (e.g.: can it boot zImages or only uImages). A device configuration file stores the commands run by a dispatcher to interact with the device: how to connect to serial, how to power it on and off. LAVA interacts with devices via external tools: it has support for conmux or telnet to communicate via serial and power commands can be executed by custom scripts (pdudaemon for example).

Control power supply

Some labs use expensive Switched PDUs to control the power supply of each board but, as discussed in our previous blog post we went for several Devantech ETH008 Ethernet-controlled relay boards instead.

Linaro, the organization behind LAVA, has also developed software for controlling the power supply of each board, called pdudaemon. We added support for most Devantech relay boards to pdudaemon.

Connect to serial

As advised in LAVA’s installation guide, we went with telnet and ser2net to connect to the serial ports of our boards. Ser2net basically opens a Linux device and allows interacting with it through a TCP socket on a defined port. A LAVA dispatcher will then launch a telnet client to connect to a board’s serial port. Because Linux device names may change between reboots, we had to use udev rules to guarantee that the serial port we connect to is the one we want.

Actual testing

Now that LAVA knows how to handle devices, it has to run jobs on those devices. LAVA jobs specify which images to boot (kernel, DTB, rootfs), what kind of tests to run once in user space, and where to find them. A job is strongly linked to a device type since it contains the kernel and DTB specifically built for this device type.

Those jobs are submitted to the different labs by the KernelCI project. To do so, KernelCI uses a tool called lava-ci. Amongst other things, this tool contains a big table of the supported platforms, associating the Device Tree name with the corresponding hardware platform name. This way, when a new kernel gets built by KernelCI and produces a number of Device Tree Blobs (.dtb files), lava-ci knows which hardware platforms to run the kernel on. It submits the jobs to all the labs, which will then only run the tests for which they have the necessary hardware platform. We have contributed a number of patches to lava-ci, adding support for the new platforms we had in our lab.
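
The table lookup and the per-lab filtering can be pictured with a few lines of code. This is an illustration of the idea only, not lava-ci's actual code or data format: the table entries, device type names and function names below are hypothetical.

```python
# Hypothetical excerpt of the table: Device Tree blob -> device type name.
DTB_TO_DEVICE_TYPE = {
    "sun5i-r8-chip.dtb": "sun5i-r8-chip",
    "armada-370-rd.dtb": "armada-370-rd",
    "at91-sama5d3_xplained.dtb": "at91-sama5d3_xplained",
}

def jobs_for_build(built_dtbs):
    """Pair every known DTB produced by a build with its device type."""
    return [(dtb, DTB_TO_DEVICE_TYPE[dtb])
            for dtb in built_dtbs if dtb in DTB_TO_DEVICE_TYPE]

def jobs_for_lab(jobs, lab_device_types):
    """Keep only the jobs a lab can run on hardware it actually has."""
    return [job for job in jobs if job[1] in lab_device_types]

build = ["sun5i-r8-chip.dtb", "armada-370-rd.dtb", "unknown-board.dtb"]
jobs = jobs_for_build(build)
print(jobs_for_lab(jobs, {"armada-370-rd"}))
```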

LAVA overall architecture

Reporting test results

After KernelCI has built the kernel, sent jobs to contributing labs and LAVA has run the jobs, KernelCI will then get the test results from the labs, aggregate them on its website and notify maintainers of errors via a mailing list.

Challenges encountered

As in any project, we stumbled on some difficulties. The biggest problems we had to take care of were board-specific problems.

Some boards like the Marvell RD-370 need a rising edge on a pin to boot, meaning we could not avoid pressing the reset button before each boot. To work around this problem, we had to customize the hardware (swap resistors) to bypass this limitation.

Some other boards lose their serial connection. Some lose it when their power is reset but recover it after a few seconds, a problem we found acceptable to solve by reconnecting to the serial port indefinitely. However, we still have a problem with a few boards which randomly close their serial connection for no apparent reason. After that, we are able to connect to the serial port again but it does not send any characters. The only way to get it to work again is to physically re-plug the cable used by the serial connection. Unfortunately, we have not yet found a way to solve this bug.
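
The reconnect workaround amounts to a retry loop around the connection attempt. The following is a generic sketch, not the lab's actual configuration: the function name, the delay and the optional attempt limit are assumptions made for illustration.

```python
import time

def connect_with_retry(connect, delay_s=1.0, max_attempts=None,
                       sleep=time.sleep):
    """Call connect() until it succeeds, waiting delay_s between attempts.

    With max_attempts=None the loop retries forever, which mirrors the
    approach described above; a limit is offered only to ease testing.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except OSError:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay_s)
```

A caller could pass, for example, a function that opens a TCP connection to the ser2net port of a board with socket.create_connection; the host and port would be specific to each lab.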

The Linux kernel of our server refused to bind more than 13 USB devices when it was time to create a second drawer of boards. After some research, we found out the culprit was the xHCI driver. In modern computers, it is possible to disable xHCI support in the BIOS but this option was not present in our server’s BIOS. The solution was to rebuild and install a kernel for the server without the xHCI driver compiled in. Since then, the number of USB devices is limited to 127, as in the USB specification.

Conclusion

We now have 35 boards in our lab, with some being the only ones represented in KernelCI. We encourage anyone, hobbyists or companies, to contribute to the effort of bringing continuous integration to the Linux kernel by building their own lab and adding as many boards as they can.

Interested in becoming a lab? Follow the guide!

by Quentin Schulz at December 12, 2016 01:05 PM

December 07, 2016

Harald Welte

Open Hardware IEEE 802.15.4 adapter "ATUSB" available again

Many years ago, in the aftermath of Openmoko shutting down, fellow former Linux kernel hacker Werner Almesberger was working on an IEEE 802.15.4 (WPAN) adapter for the Ben Nanonote.

As a spin-off to that, the ATUSB device was designed: A general-purpose open hardware (and FOSS firmware + driver) IEEE 802.15.4 adapter that can be plugged into any USB port.

/images/atusb.jpg

This adapter has received a mainline Linux kernel driver written by Werner Almesberger and Stefan Schmidt, which was eventually merged into mainline Linux in May 2015 (kernel v4.2 and later).

Earlier in 2016, Stefan Schmidt (the current ATUSB Linux driver maintainer) approached me about the situation that ATUSB hardware was frequently asked for, but currently unavailable in its physical/manufactured form. As we run a shop with smaller electronics items for the wider Osmocom community at sysmocom, and we also frequently deal with contract manufacturers for low-volume electronics like the SIMtrace device anyway, it was easy to say "yes, we'll do it".

As a result, ready-built, programmed and tested ATUSB devices are now finally available from the sysmocom webshop.

Note: I was never involved with the development of the ATUSB hardware, firmware or driver software at any point in time. All credits go to Werner, Stefan and other contributors around ATUSB.

by Harald Welte at December 07, 2016 12:00 AM

December 06, 2016

Harald Welte

The IT security culture, hackers vs. industry consortia

In a previous life I used to do a lot of IT security work, probably even at a time when most people had no idea what IT security actually is. I grew up with the Chaos Computer Club, as it was a great place to meet people with common interests, skills and ethics. People were hacking (aka 'doing security research') for fun, to grow their skills, to advance society, to point out corporate stupidities and to raise awareness about issues.

I've always shared any results worth noting with the general public. Whether it was in RFID security, on GSM security, TETRA security, etc.

Even more so, I always shared the tools, creating free software implementations of systems that - at that time - were very difficult to impossible to access unless you worked for the vendors of related devices, who obviously had a different agenda than to disclose security concerns to the general public.

Publishing security related findings at related conferences can be interpreted in two ways:

On the one hand, presenting at a major event will add to your credibility and reputation. That's a nice byproduct, but that shouldn't be the primary reason, unless you're some kind of an egocentric stage addict.

On the other hand, presenting findings or giving any kind of presentation or lecture at an event is a statement of support for that event. When I submit a presentation at a given event, I think carefully if that topic actually matches the event.

The reason that I didn't submit any talks in recent years at CCC events is not that I didn't do technically exciting stuff that I could talk about - or that I wouldn't have the reputation that would make people consider my submission in the programme committee. I just thought there was nothing in my work relevant enough to bother the CCC attendees with.

So when Holger 'zecke' Freyther and I chose to present about our recent journeys into exploring modern cellular modems at the annual Chaos Communication Congress, we did so because the CCC Congress is the right audience for this talk. We did so, because we think the people there are the kind of community of like-minded spirits that we would like to contribute to. To whom we would like to give something back, for the many years of excellent presentations and conversations had.

So far so good.

However, in 2016, something happened that I haven't seen yet in my 17 years of speaking at Free Software, Linux, IT Security and other conferences: A select industry group (in this case the GSMA) asking me out of the blue to give them the talk one month in advance at a private industry event.

I could hardly believe it. How could they? Who am I? Am I spending sleepless nights and non-existing spare time into security research of cellular modems to give a free presentation to corporate guys at a closed industry meeting? The same kind of industries that create the problems in the first place, and who don't get their act together in building secure devices that respect people's privacy? Certainly not. I spend sleepless nights of hacking because I want to share the results with my friends. To share it with people who have the same passion, whom I respect and trust. To help my fellow hackers to understand technology one step more.

If that kind of request to undermine the researcher's/author's initial publication among friends is happening to me, I'm quite sure it must be happening to other speakers at the 33C3 or other events, too. And that makes me very sad. I think the initial publication is something that connects the speaker/author with his audience.

Let's hope the researchers/hackers/speakers have sufficiently strong ethics to refuse such requests. If certain findings are initially published at a certain conference, then that is the initial publication. Period. Sure, you can ask afterwards if an author wants to repeat the presentation (or a similar one) at other events. But pre-empting the initial publication? Certainly not with me.

I offered the GSMA that I could talk on the importance of having FOSS implementations of cellular protocol stacks as an enabler for security research, but apparently this was not of interest to them. Seems like all they wanted is an exclusive heads-up on work they neither commissioned nor supported in any other way.

And btw, I don't think what Holger and I will present about is all that exciting in the first place. More or less the standard kind of security nightmares. By now we are all so numbed down by nobody considering security and/or privacy in design of IT systems, that it is hardly any news. IoT as it is done so far might very well be the doom of mankind. An unstoppable tsunami of insecure and privacy-invading devices, built on ever more complex technology with way too many security issues. We shall henceforth call IoT the Industry of Thoughtlessness.

by Harald Welte at December 06, 2016 07:00 AM

DHL zones and the rest of the world

I typically prefer to blog about technical topics, but the occasional stupidity in every-day (business) life is simply too hard to resist.

Today I updated the shipping pricing / zones in the ERP system of my company to predict shipping rates based on weight and destination of the package.

Deutsche Post, the German postal system, is using their DHL brand for postal packages. They divide the world into four zones:

  • Zone 1 (EU)
  • Zone 2 (Europe outside EU)
  • Zone 3 (World)

You would assume that "World" encompasses everything that's not part of the other zones. So far so good. However, I then stumbled upon Zone 4 (rest of world). See for yourself:

/images/dhl-rest_of_world.png

So the World according to DHL is a very small group of countries including Libya and Syria, while countries like Mexico are rest of world.

Quite charming, I wonder which PR, communications or marketing guru came up with such a disqualifying name. Maybe they should have called it 3rd world and 4th world instead? Or even discworld?

by Harald Welte at December 06, 2016 06:50 AM

December 03, 2016

ZeptoBARS

CD4049 - hex CMOS inverter : weekend die-shot

On CD4049 you can see 6 independent inverters, each having 3 inverters connected in series with increasing gate width on each stage - this helps to achieve higher speed and lower input capacitance. Gate length is 6µm, so it is probably the slowest CMOS circuit one can ever see. Gates are metal (i.e. not self-aligned silicon), which was again the slower type at that time.

Die size 722x552 µm.


December 03, 2016 03:27 PM

Bunnie Studios

NeTV2 FPGA Reference Design

A complex system like NeTV2 consists of several layers of design. About a month ago, we pushed out the PCB design. But a PCB design alone does not a product make: there’s an FPGA design, firmware for the on-board MCU, host drivers, host application code, and ultimately layers in the cloud and beyond. We’re slowly working our way from the bottom up, assembling and validating the full system stack. In this post, we’ll talk briefly about the FPGA design.

This design targets an Artix-7 XC7A50TCSG325-2 FPGA. As such, I opted to use Xilinx’s native Vivado design flow, which is free to download and use, but not open source. One of Vivado’s more interesting features is a hybrid schematic/TCL design flow. The designs themselves are stored as an XML file, and dynamically rendered into a schematic. The schematic itself can then be updated and modified by using either the GUI or TCL commands. This hybrid flow strikes a unique balance between the simplicity and intuitiveness of designing with a schematic, and the power of text-based scripting.


Above: top-level schematic diagram of the NeTV2 FPGA reference design as rendered by the Vivado tools

However, the main motivation to use Vivado is not the design entry methodology per se. Rather, it is Vivado’s tight integration with the AXI IP bus standard. Vivado can infer AXI bus widths, address space mappings, and interconnect fabric topology based on the types of blocks that are being strung together. The GUI provides some mechanisms to tune parameters such as performance vs. area, but it’s largely automatic and does the right thing. Being able to mix and match IP blocks with such ease can save months of design effort. However, the main downside of using Vivado’s native IP blocks is they are area-inefficient; for example, the memory-mapped PCI express block includes an area-intensive slave interface which is synthesized, placed, and routed — even if the interface is totally unused. Fortunately many of the IP blocks compile into editable verilog or VHDL, and in the case of the PCI express block the slave interface can be manually excised after block generation, but prior to synthesis, reclaiming the logic area of that unused interface.

Using Vivado, I’m able to integrate a PCI-express interface, AXI memory crossbar, and DDR3 memory controller with just a few minutes of effort. With similar ease, I’ve added in some internal AXI-mapped GPIO pins to provide memory-mapped I/O within the FPGA, along with a video DMA master which can format data from the DDR3 memory and stream it out as raster-synchronous RGB pixel data. All told, after about fifteen minutes of schematic design effort I’m positioned to focus on coding my application, e.g. the HDMI decode/encode, HDCP encipher, key extraction, and chroma key blender.

Below is the “hierarchical” view of this NeTV2 FPGA design. About 75% of the resources are devoted to the Vivado IP blocks, and about 25% to the custom NeTV application logic; altogether, the design uses about 72% of the XC7A50T FPGA’s LUT resources. A full-custom implementation of the Vivado IP blocks would save a significant amount of area, as well as be more FOSS-friendly, but it would also take months to implement an equivalent level of functionality.
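
To put these percentages in terms of the whole device, here is a small worked calculation. It rests on an assumption that the post does not state explicitly: that the 75%/25% split refers to the LUTs actually used by the design, not to the total LUTs on the chip.

```python
# Figures quoted in the text (all approximate).
used_fraction = 0.72   # share of the XC7A50T's LUTs used by the whole design
ip_share = 0.75        # share of the used LUTs taken by Vivado IP blocks (assumed meaning)
custom_share = 0.25    # share of the used LUTs taken by custom NeTV logic (assumed meaning)

ip_of_device = used_fraction * ip_share
custom_of_device = used_fraction * custom_share
free_of_device = 1.0 - used_fraction

print(f"Vivado IP blocks: {ip_of_device:.0%} of the device's LUTs")      # 54%
print(f"Custom NeTV logic: {custom_of_device:.0%} of the device's LUTs")  # 18%
print(f"Still free: {free_of_device:.0%} of the device's LUTs")          # 28%
```

Under that reading, the generated IP blocks alone occupy roughly half of the FPGA, which is why a full-custom replacement would free up a significant amount of area.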

Significantly, the FPGA reference design shared here implements only the “basic” NeTV chroma-key based blending functionality, as previously disclosed here. Although we would like to deploy more advanced features such as alpha blending, I’m unable to share any progress because this operation is generally prohibited under Section 1201 of the DMCA. With the help of the EFF, I’m suing the US government for the right to disclose and share these developments with the general public, but until then, my right to express these ideas is chilled by Section 1201.

by bunnie at December 03, 2016 07:17 AM

December 02, 2016

Free Electrons

Buildroot 2016.11 released, Free Electrons contributions

The 2016.11 release of Buildroot was published on November 30th. The release announcement, by Buildroot maintainer Peter Korsgaard, gives numerous details about the new features and updates brought by this release. This new release provides support for using multiple BR2_EXTERNAL directories, gives some important updates to the toolchain support, adds default configurations for 9 new hardware platforms, and adds 38 new packages.

Out of a total of 1423 commits made for this release, Free Electrons contributed 253 commits:

$ git shortlog -sn --author=free-electrons 2016.08..2016.11
   142  Gustavo Zacarias
   104  Thomas Petazzoni
     7  Romain Perier

Here are the most important contributions we made:

  • Romain Perier contributed a package for the AMD Catalyst proprietary driver. Such drivers are usually not trivial to integrate, so having a ready-to-use package in Buildroot will really make it easier for Buildroot users who use hardware with an AMD/ATI graphics controller. This package provides both the X.org driver and the OpenGL implementation. This work was sponsored by one of Free Electrons’ customers.
  • Gustavo Zacarias mainly contributed a large set of patches that make a small update to numerous packages, to make sure the proper environment variables are passed. This is a preparatory change to bring top-level parallel build to Buildroot. This work was sponsored by another Free Electrons customer.
  • Thomas Petazzoni contributed in various areas:
    • Added a DEVELOPERS file to the tree, to reference which developers are interested in which architectures and packages. Not only does it allow the developers to be Cc’ed when patches are sent on the mailing list (like the get_maintainers script does), but it is also used by the Buildroot autobuilder infrastructure: if a package fails to build, the corresponding developer is notified by e-mail.
    • Misc updates to the toolchain support: switch to gcc 5.x by default, addition of gcc patches needed to fix various issues, etc.
    • Numerous fixes for build issues detected by Buildroot autobuilders

In addition to contributing 104 commits, Thomas Petazzoni also merged 1095 patches from other developers during this cycle, in order to help Buildroot maintainer Peter Korsgaard.

Finally, Free Electrons also sponsored the Buildroot project, by funding the meeting location for the previous Buildroot Developers meeting, which took place in October in Berlin, after the Embedded Linux Conference. See the Buildroot sponsors page, and also the report from this meeting. The next Buildroot meeting will take place after the FOSDEM conference in Brussels.

by Thomas Petazzoni at December 02, 2016 03:14 PM

Free Electrons at Linux.conf.au, January 2017

Linux.conf.au, which takes place every year in January in Australia or New Zealand, is a major event of the Linux community. Free Electrons already participated in this event three years ago, and will participate again in this year’s edition, which will take place from January 16 to January 20 2017 in Hobart, Tasmania.

Linux Conf Australia 2017

This time, Free Electrons CTO Thomas Petazzoni will give a talk titled A tour of the ARM architecture and its Linux support, in which he will explain to LCA attendees what the ARM architecture is, how its Linux support works, what the numerous variants of ARM processors and boards mean, what the Device Tree is, what the ARM-specific bootloaders are, and more.

Linux.conf.au also features a number of other kernel related talks, such as the Kernel Report from Jonathan Corbet and Linux Kernel memory ordering: help arrives at last from Paul E. McKenney. The list of talks is very impressive, and the event also features a number of miniconfs, including one on the Linux kernel.

If some of our readers located in Australia, New Zealand or neighboring countries plan on attending the conference, do not hesitate to drop us a mail so that we can meet during the event!

by Thomas Petazzoni at December 02, 2016 08:50 AM

November 29, 2016

Bunnie Studios

Name that Ware, November 2016

The Ware for November 2016 is shown below.

Happy holidays!

by bunnie at November 29, 2016 05:26 PM

Winner, Name that Ware October 2016

The Ware for October 2016 is a hard drive read head, from a 3.5″ Toshiba hard drive that I picked out of a trash heap. The drive was missing the cover which bore the model number, but based on the chips used on its logic board, the drive was probably made between 2011 and 2012. This photo was taken at about 40x magnification. Congrats to Jeff Epler for nailing the ware as the first guesser; email me for your prize!

by bunnie at November 29, 2016 05:26 PM

Free Electrons

Hardware infrastructure of Free Electrons’ lab

As stated in a previous blog post, we officially launched our lab on April 25th, 2016 and it has been contributing to KernelCI since then. In a series of blog posts, we’d like to present in detail how our lab works, starting with this first blog post that details the hardware infrastructure of our lab.

Introduction

In a lab built for continuous integration, everything has to be fully automated from the serial connections to power supplies and network connections.

To gather as much information as we could to establish the specifications of the lab, our engineers filled in a spreadsheet with all the boards they wanted to have in the lab and their specificities in terms of connectors used for serial port communication and power supply. We reached around 50 boards to put into our lab. Among those boards, we could distinguish two different types:

  • boards which are powered by an ATX power supply,
  • boards which are powered by different power adapters, providing either 5V or 12V.

Another design criterion was that we wanted to easily allow our engineers to take a board out of the lab or to add one. The easier the process is, the better the lab is.

Home made cabinet

To meet the size constraints of Free Electrons’ office, we had to make the lab fit in a 100cm wide, 75cm deep and 200cm high space. In order to achieve this, we decided to build the lab as a large home made cabinet, with a number of drawers to easily access, change or replace the boards hosted in the lab. As some of our boards provide PCIe connectors, we needed to provide enough height for each drawer, and after doing a few measurements, decided that a 25cm height for our drawers would be fine. With a total height of 200cm, this gives a maximum of 8 drawers.

In addition, it turns out that most of our boards powered by ATX power supplies are rather large in size, while the ones powered by regular power adapters are usually much smaller. In order to simplify the overall design, we decided that all large boards would be grouped together on a given set of drawers, and all small boards would be grouped together on another set of drawers: i.e. we would not mix large and small boards in the same drawer. With the 100cm x 75cm size limitation, this meant a drawer for small boards could host up to 8 boards, while a drawer for large boards could host up to 4 boards. From the spreadsheet containing all the boards supposed to be in the lab, we eventually decided there would be 3 large drawers for up to 12 large boards and 5 small drawers for up to 40 small or medium-sized boards.
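
The sizing arithmetic above is easy to check with a few lines of Python. The constants are the figures quoted in the text; the function and variable names are only illustrative.

```python
CABINET_HEIGHT_CM = 200
DRAWER_HEIGHT_CM = 25
BOARDS_PER_LARGE_DRAWER = 4
BOARDS_PER_SMALL_DRAWER = 8

# How many 25cm drawers fit in a 200cm high cabinet.
max_drawers = CABINET_HEIGHT_CM // DRAWER_HEIGHT_CM


def lab_capacity(large_drawers, small_drawers):
    """Return (large boards, small boards, total) for a given drawer split."""
    if large_drawers + small_drawers > max_drawers:
        raise ValueError("cabinet cannot hold that many drawers")
    large = large_drawers * BOARDS_PER_LARGE_DRAWER
    small = small_drawers * BOARDS_PER_SMALL_DRAWER
    return large, small, large + small


print(max_drawers)         # 8
print(lab_capacity(3, 5))  # (12, 40, 52)
```

The chosen split of 3 large and 5 small drawers therefore leaves room for 52 boards, slightly more than the roughly 50 boards listed in the spreadsheet.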

Furthermore, since the lab will host a server and a lot of boards and power supplies, potentially producing a lot of heat, we have to keep the lab as open as it can be while making sure it is strong enough to hold the drawers. We ended up building our own cabinet, made of wood bought from the local hardware store.

We also want the server to be part of the lab. We already have a small piece of wood strengthening the lab between the fourth and sixth drawers, to which we could attach the server. We decided to give a mini-PC (NUC-like) a try, because, after all, it’s only communicating with the serial port of each board and serving files to them. Thus, everything related to the server is attached and wired behind the lab.

Make the lab autonomous

Continuous integration for the Linux kernel typically needs:

  1. control of the power for each board
  2. control of the serial port connection
  3. a way to send files to test, typically the kernel image and associated files

In Free Electrons’ lab, these different tasks are handled by a dedicated server, itself hosted in the lab.

Serial port control

Serial connections are mostly handled via USB on the server side but there are many different connectors on the target side (in our lab, we have 6 different connectors: DE9, microUSB, miniUSB, 2.54 mm male pins, 2.54 mm female pins and USB-B). Therefore, our server has to have a physical connection with each of the 50 boards present in the lab. The need for USB hubs is then obvious.

Since we want as few cables connecting the server and the drawers as possible, we decided to have one USB hub per drawer, be it a large drawer or a small drawer. In a small drawer, up to 8 boards can be present, meaning the hub needs at least 8 USB ports. In a large drawer, up to 4 serial connections can be needed so smaller and more common USB hubs can do the work. Since the serial connection may draw some current on the USB port, we wanted all of our USB hubs to be powered with a dedicated power supply.

All USB hubs are then connected to a main USB hub which in turn is connected to our server.

Power supply control

Our server needs to control each board’s power to be able to automatically power on or off a board. It will power on the board when it needs to test a new kernel on it and power it off at the end of the test or when the kernel has frozen or could not boot at all.

In terms of power supplies, we initially investigated using Ethernet-controlled multi-sockets (also called Switched PDU), such as this device. Unfortunately, these devices are quite expensive, and also often don’t provide the most appropriate connector to plug the cheap 5V/12V power adapters used by most boards.

So, instead, and following a suggestion from Kevin Hilman (one of KernelCI’s founders and maintainers), we decided to use regular ATX power supplies. They have the advantage of being inexpensive, and providing enough power for multiple boards and all their peripherals, potentially including hard drives or other power-hungry peripherals. ATX power supplies also have a pin, called PS_ON#, which when tied to the ground, powers up the ATX power supply. This makes it easy to turn an ATX power supply on or off.

In conjunction with the ATX power supplies, we have selected an Ethernet-controlled relay board, the Devantech ETH008, which contains 8 relays that can be remote controlled over the network.

This gives us the following architecture:

  • For the drawers with large boards powered by ATX directly, we have one ATX power supply per board. The PS_ON# pin from the ATX power supply is cut and rewired to the Ethernet-controlled relay. Thanks to the relay, we control whether PS_ON# is tied to the ground or not. When it’s tied to the ground, the board boots; when it’s untied from the ground, the board is powered off.
  • For the drawers with small boards, we have a single ATX power supply per drawer. The 12V and 5V rails from the ATX power supply are then dispatched through the 8-relay board, then connected to the appropriate boards, through DC barrel or mini-USB/micro-USB cables, depending on the board. The PS_ON# pin is always tied to the ground, so those ATX power supplies are constantly on.
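
From the server's point of view, both wirings reduce to the same operation: close one relay to power a board, open it to cut the power. The following Python sketch illustrates that idea only. `set_relay` is a hypothetical callback standing in for whatever actually talks to the relay board (its network protocol is not covered here), and the assumption that a closed relay means "board powered" depends on how the relay contacts are wired.

```python
import time


class BoardPower:
    """Power control for one board through one relay channel.

    set_relay(channel, closed) is a hypothetical callback driving the relay
    board. Assumption: closing the relay powers the board, either by tying
    PS_ON# to ground (large boards) or by passing the 5V/12V rail through
    (small boards).
    """

    def __init__(self, set_relay, channel, sleep=time.sleep):
        self._set_relay = set_relay
        self._channel = channel
        self._sleep = sleep

    def on(self):
        self._set_relay(self._channel, True)

    def off(self):
        self._set_relay(self._channel, False)

    def cycle(self, off_time_s=2.0):
        """Power the board off, wait, then power it on again."""
        self.off()
        self._sleep(off_time_s)
        self.on()


# Demonstration with a fake relay that only records what it is asked to do.
log = []
board = BoardPower(lambda channel, closed: log.append((channel, closed)), channel=3,
                   sleep=lambda seconds: None)
board.cycle()
print(log)  # [(3, False), (3, True)]
```

Keeping the relay access behind a single callback means the test scheduler does not need to know whether a board sits in a large or a small drawer.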

In addition, we have added a bit of over-voltage protection, by adding transient-voltage-suppression diodes for each voltage output in each drawer. These diodes are connected in parallel with the circuit to protect: when the voltage exceeds the maximum authorized value, they absorb the excess, destroying themselves in the process if need be.

Network connectivity

As part of the continuous integration process, most of our boards will have to fetch the Linux kernel to test (and potentially other related files) over the network through TFTP. So we need all boards to be connected to the server running the continuous integration software.

Since a single 52-port switch is both fairly expensive and not very convenient in terms of wiring in our situation, we instead opted for adding 8-port Gigabit switches to each drawer, all of them being connected via a central 16-port Gigabit switch located at the back of the home made cabinet. This central switch not only connects the per-drawer switches, but also the server running the continuous integration software, and the wider Internet.

In-drawer architecture: large boards

A drawer designed for large boards, powered by an ATX power supply, contains the following components:

  • Up to four boards
  • Four ATX power-supplies, with their PS_ON# connected to an 8-port relay controller. Only 4 of the 8 ports are used on the relay.
  • One 8-port Ethernet-controlled relay board.
  • One 4-port USB hub, connecting to the serial ports of the four boards.
  • One 8-port Ethernet switch, with 4 ports used to connect to the boards, one port used to connect to the relay board, and one port used for the upstream link.
  • One power strip to power the different components.

Large drawer example scheme

Large drawer in the lab

In-drawer architecture: small boards

A drawer designed for small boards contains the following components:

  • Up to eight boards
  • One ATX power-supply, with its 5V and 12V rails going through the 8-port relay controller. All ports in the relay are used when 8 boards are present.
  • One 8-port Ethernet-controlled relay board.
  • One 10-port USB hub, connecting to the serial ports of the eight boards.
  • Two 8-port Ethernet switches, connecting the 8 boards, the relay board and an upstream link.
  • One power strip to power the different components.

Small drawer example scheme

Small drawer in the lab
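
The number of Ethernet switches in each drawer type follows from a simple port budget: one port per board, one for the relay board and one for the upstream link. The sketch below works this out in Python. It assumes that, when one switch is not enough, the switches in a drawer are chained together, so that each link between two switches costs one port on each side; the text does not say how the two switches are actually wired.

```python
def switches_needed(ports_required, ports_per_switch=8):
    """Smallest number of chained switches offering enough free ports.

    Assumption: switches are daisy-chained, so every link between two
    switches uses up one port on each of them.
    """
    count = 1
    while count * ports_per_switch - 2 * (count - 1) < ports_required:
        count += 1
    return count


# boards + relay board + upstream link
large_drawer_ports = 4 + 1 + 1
small_drawer_ports = 8 + 1 + 1

print(large_drawer_ports, switches_needed(large_drawer_ports))  # 6 1
print(small_drawer_ports, switches_needed(small_drawer_ports))  # 10 2
```

A fully populated small drawer needs 10 ports, two more than a single 8-port switch offers, which is why it gets a second switch.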

Server

At the back of the home made cabinet, a mini PC runs the continuous integration software, which we will discuss in a future blog post. This mini PC is connected to:

  • A main 16-port Gigabit switch, itself connected to all the Gigabit switches in the different drawers
  • A main USB hub, itself connected to all the USB hubs in the different drawers

As expected, this allows the server to control the power of the different boards, access their serial port, and provide network connectivity.

Detailed component list

If you’re interested in the specific components we’ve used for our lab, here is the complete list, with the relevant links:

Conclusion

Hopefully, sharing these details about the hardware architecture of our board farm will help others to create a similar automated testing infrastructure. We of course welcome feedback on this hardware architecture!

Stay tuned for our next blog post about the software architecture of our board farm.

by Quentin Schulz at November 29, 2016 01:31 PM

November 28, 2016

ZeptoBARS

JCST CJ431 : weekend die-shot

CJ431 is another implementation of 431 shunt voltage reference manufactured by Jiangsu Changjiang Electronics Technology (JCST).
Die size 620x631 µm.


November 28, 2016 06:10 AM

November 27, 2016

Harald Welte

Ten years anniversary of Openmoko

In 2006 I first visited Taiwan. The reason back then was Sean Moss-Pultz contacting me about a new Linux and Free Software based Phone that he wanted to do at FIC in Taiwan. This later became the Neo1973 and the Openmoko project and finally became part of both Free Software as well as smartphone history.

Ten years later, it might be worth sharing a bit of a retrospective.

It was about building a smartphone before Android or the iPhone existed or even were announced. It was about doing things "right" from a Free Software point of view, with FOSS requirements going all the way down to component selection of each part of the electrical design.

Of course it was quite crazy in many ways. First of all, it was a bunch of white, long-nosed western guys in Taiwan, starting a company around Linux and Free Software, at a time where that was not really well-perceived in the embedded and consumer electronics world yet.

It was also crazy in terms of the many cultural 'impedance mismatches', and I think at some point it might even be worth writing a book about the many stories we experienced. The biggest problem here is of course that I wouldn't want to expose any of the companies or people in the many instances where something went wrong. So probably it will remain a secret to those present at the time :/

In any case, it was a great project and definitely one of the most exciting (albeit busy) times in my professional career so far. It was also great that I could involve many friends and FOSS-compatriots from other projects in Openmoko, such as Holger Freyther, Mickey Lauer, Stefan Schmidt, Daniel Willmann, Joachim Steiger, Werner Almesberger, Milosch Meriac and others. I am happy to still work on a daily basis with some of that group, while others have moved on to other areas.

I think we all had a lot of fun, learned a lot (not only about Taiwan), and were working really hard to get the hardware and software into shape. However, the constantly growing scope, the [for western terms] quite unclear and constantly changing funding/budget situation and the many changes in direction have ultimately led to missing the market opportunity. At the time the iPhone and later Android entered the market, it was too late for a small crazy Taiwanese group of FOSS-enthusiastic hackers to still have a major impact on the landscape of Smartphones. We tried our best, but in the end, after a lot of hype and publicity, it never was a commercial success.

What's more sad to me than the lack of commercial success is the lack of successful free software that resulted. Sure, there were some u-boot and Linux kernel drivers that got merged mainline, but none of the three generations of UI stacks (GTK, Qt or EFL based), nor the GSM Modem abstraction gsmd/libgsmd nor middleware (freesmartphone.org) has managed to survive the end of the Openmoko company, despite having deserved to survive.

Probably the most important part that survived Openmoko was the pioneering spirit of building free software based phones. This spirit has inspired pure volunteer based projects like GTA04/Openphoenux/Tinkerphone, who have achieved extraordinary results - but who are in a very small niche.

What does this mean in practise? We're stuck with a smartphone world in which we can hardly escape any vendor lock-in. It's virtually impossible in the non-free-software iPhone world, and it's difficult in the Android world. In 2016, we have more Linux based smartphones than ever - yet we have less freedom on them than ever before. Why?

  • the amount of hardware documentation on the processors and chipsets today is typically less than 10 years ago. Back then, you could still get the full manual for the S3C2410/S3C2440/S3C6410 SoCs. Today, this is not possible for the application processors of any vendor
  • the tighter integration of application processor and baseband processor means that it is no longer possible on most phone designs to have the 'non-free baseband + free application processor' approach that we had at Openmoko. It might still be possible if you designed your own hardware, but it's impossible with any actually existing hardware in the market.
  • Google blurring the line between FOSS and proprietary code in the Android OS. Yes, there's AOSP - but how many features are lacking? And on how many real-world phones can you install it? Particularly with the Google Nexus line being EOL'd? One of the popular exceptions is probably Fairphone2 with its alternative AOSP operating system, even though that's not the default of what they ship.
  • The many binary-only drivers / blobs, from the graphics stack to wifi to the cellular modem drivers. It's a nightmare and really scary if you look at all of that, e.g. at the binary blob downloads for Fairphone2 to get an idea about all the binary-only blobs on a relatively current Qualcomm SoC based design. That's compressed 70 Megabytes, probably as large as all of the software we had on the Openmoko devices back then...

So yes, the smartphone world is much more restricted, locked-down and proprietary than it was back in the Openmoko days. If we had been more successful then, that world might be quite different today. It was a lost opportunity to make the world embrace more freedom in terms of software and hardware. Without single-vendor lock-in and proprietary obstacles everywhere.

by Harald Welte at November 27, 2016 03:00 PM

November 24, 2016

Harald Welte

Open Hardware Multi-Voltage USB UART board released

During the past 16 years I have been playing a lot with a variety of embedded devices.

One of the most important tasks for debugging or analyzing embedded devices is usually to get access to the serial console on the UART of the device. That UART is often exposed at whatever logic level the main CPU/SOC/uC is running on. For 5V and 3.3V that is easy, but for more and more unusual voltages I always had to build a custom cable or a custom level shifter.

In 2016, I finally couldn't resist any longer and built a multi-voltage USB UART adapter.

This board exposes two UARTs at a user-selectable voltage of 1.8, 2.3, 2.5, 2.8, 3.0 or 3.3V. It can also use any other logic voltage between 1.8 and 3.3V, provided it can source a reference of that voltage from the target embedded board.

Rather than just building one for myself, I released the design as open hardware under CC-BY-SA license terms. Full schematics + PCB layout design files are available. For more information see http://osmocom.org/projects/mv-uart/wiki

In case you don't want to build it from scratch, ready-made machine assembled boards are also made available from http://shop.sysmocom.de/products/multi-voltage-usb-dual-uart

by Harald Welte at November 24, 2016 11:00 PM

Open Hardware miniPCIe WWAN modem USB breakout board released

There are plenty of cellular modems on the market in the mPCIe form factor.

Playing with such modems is reasonably easy: you can simply insert them into an mPCIe slot of a laptop or an embedded device (soekris, pc-engines or the like).

However, many of those modems actually export interesting signals like digital PCM audio or UART ports on some of the mPCIe pins, both in standard and in non-standard ways. Those signals are inaccessible in those embedded devices or in your laptop.

So I built a small break-out board which performs the basic function of exposing the mPCIe USB signals on a USB mini-B socket, providing power supply to the mPCIe modem, offering a SIM card slot at the bottom, and exposing all additional pins of the mPCIe header on a standard 2.54mm pitch header for further experimentation.

The design of the board (including schematics and PCB layout design files) is available as open hardware under CC-BY-SA license terms. For more information see http://osmocom.org/projects/mpcie-breakout/wiki

If you don't want to build your own board, fully assembled and tested boards are available from http://shop.sysmocom.de/products/minipcie-wwan-modem-usb-break-out-board

by Harald Welte at November 24, 2016 11:00 PM

November 13, 2016

ZeptoBARS

Infineon BCR185W - PNP BJT with bias resistors : weekend die-shot

Infineon BCR185W is a 0.1A PNP BJT. Bias resistors are 10 kΩ and 47 kΩ according to the datasheet.
Die size 395x285 µm.


November 13, 2016 05:19 PM

November 08, 2016

Free Electrons

Slides and videos from the Embedded Linux Conference Europe 2016

Last month, the entire Free Electrons engineering team attended the Embedded Linux Conference Europe in Berlin. The slides and videos of the talks have been posted, including the ones from the seven talks given by Free Electrons engineers:

  • Alexandre Belloni presented on ASoC: Supporting Audio on an Embedded Board, slides and video.
  • Boris Brezillon presented on Modernizing the NAND framework, the big picture, slides and video.
  • Boris Brezillon, together with Richard Weinberger from sigma star, presented on Running UBI/UBIFS on MLC NAND, slides and video.
  • Grégory Clement presented on Your newer ARM64 SoC Linux check list, slides and video.
  • Thomas Petazzoni presented on Anatomy of cross-compilation toolchains, slides and video.
  • Maxime Ripard presented on Supporting the camera interface on the C.H.I.P, slides and video.
  • Quentin Schulz and Antoine Ténart presented on Building a board farm: continuous integration and remote control, slides and video.

by Thomas Petazzoni at November 08, 2016 03:43 PM

November 05, 2016

ZeptoBARS

DVD photosensor : weekend die-shot

This is an unidentified photo-sensor from a DVD-RW drive. Most of the work is done by the middle quad - it can receive the signal, track focus (via astigmatic focusing) and follow the track. The additional quads are probably here to improve tracking; they are not used as full quads - there are fewer outputs for the left and right quads.

Die size 1839x1635 µm.



Closer look at photo-diodes:

November 05, 2016 08:55 PM

November 04, 2016

Village Telco

MP2 AWD – ‘All Wheel Drive’ Edition

The MP2 AWD “All Wheel Drive” edition is now available for order. The MP2 AWD represents a big step forward for the Mesh Potato. It is based on the same core as the MP2 Phone and is packaged in an outdoor enclosure with additional features and capabilities, most notably a second radio capable of 2T2R (MIMO) operation on 2.4 and 5GHz bands. It also has an internal USB port as well as an SD card slot. This opens up the possibilities for innovation. The SD slot can host cached content such as World Possible’s Rachel Offline project or any locally important content. The USB port is available for a variety of uses such as a 3G/4G modem for backhaul or backup.

The MP2 AWD is also easier to deploy than previous models as power, data, and telephony have been integrated into a single ethernet connection thanks to the PoE/TL adaptor that is shipped with the device. Now phone, data, and power are all served via a single cable.

The default user setup for the MP2 AWD is to use the 2.4GHz radio for local hotspot access and the 5GHz radio to create the backbone network on the mesh but it can be configured to suit a variety of scenarios.

The MP2 AWD has the following features:

  • Everything already included in MP2 Phone including:
    • Atheros AR9331 SoC with a 2.4GHz 802.11n 1×1 router in a single chip
    • Internal antenna for 2.4GHz operation
    • FXS port based on Silicon Labs Si3217x chipset
    • 16/64MB flash/ram memory configuration
    • Two 100Base-T Ethernet ports
    • High-speed UART for console support
  • A second radio module based on the MediaTek/Ralink RT5572 chipset which supports IEEE 802.11bgn 2T2R (2×2 MIMO) operation on 2.4 and 5 GHz bands.
  • Internal SD card slot capable of supporting local content serving, data caching, and general data storage applications.
  • Internal USB port which can be used for a memory device, GSM 3G/4G dongle or other USB devices.
  • PoE/TL adaptor which will carry Voice/Data/Power via a single Cat5/6 cable to the MP2 AWD. Similar to a passive PoE connector but also carries a voice telephone line connection, allowing a phone to be plugged in remotely from the MP2 AWD.

Available for order now on the Village Telco store.

by steve at November 04, 2016 07:46 PM

November 01, 2016

Bunnie Studios

NeTV2 Tech Details Live

Alphamax LLC now has details of the NeTV2 live, including links to preliminary schematics and PCB source files.

The key features of NeTV2 include:

  • mPCIe v2.0 (5Gbps x1 lane) add-in card format
  • Support for full 1080p60 video
  • Artix-7 FPGA
  • FPGA “hack port” breaking out 3x spare GTP transceiver pairs
  • 512 MB of DDR3-800 @ 32-bit wide memory for frame buffering

I adopted an add-in card format to allow end users to pick the cost/performance trade-off that suited their application the best. Some users require only a text overlay (NeTV’s original design scenario); but others wanted to blend HD video and 3D graphics, which would require a substantially more powerful and expensive CPU. An add-in card allows users to plug into anything from an economical $60 all-in-one, to a fully loaded gaming machine. The kosagi forum has an open thread for NeTV2 discussion.

As noted previously, we are currently seeking legal clarity on the suite of planned features for the product, including highly requested features such as alpha blending which require access to the descrambled video stream.

by bunnie at November 01, 2016 09:35 AM

October 30, 2016

Bunnie Studios

Name that Ware, October 2016

The Ware for October 2016 is shown below:

I like this one because not only is it exquisitely engineered, it’s also aesthetically pleasing.

Sorry for the relative radio silence on the blog — been very heads down the past couple months grinding through several major projects, including my latest book, “The Hardware Hacker”, which is on-track to hit shelves in a couple of months!

by bunnie at October 30, 2016 04:14 PM

Winner, Name that Ware September 2016

The ware for September 2016 is a ColorVision Spyder-series monitor color calibrator.

Congrats to North-X for naming the ware, email me for your prize!

by bunnie at October 30, 2016 04:11 PM

October 29, 2016

ZeptoBARS

K140UD2B - Soviet opamp : weekend die-shot

K140UD2B is an old Soviet opamp without internal frequency compensation. Similar to RCA CA3047T. ICs manufactured in ~1982 have bare die in metal can, ones manufactured in 1988 - have some protective overcoat inside metal can (which is quite unusual).
Die size 1621x1615 µm.


October 29, 2016 07:46 PM

October 28, 2016

Mirko Vogt, nanl.de

intel 540s SSD fail

My intel SSD failed. Hard. As in: its content got wiped. But before getting way too theatrical, let’s stick to the facts first.

I upgraded my Lenovo ThinkPad X1 Carbon with a bigger SSD in the late summer this year — a 1TB intel 540s (M.2).

The BIOS of ThinkPads (and probably other brands as well) offers to secure your drive with an ATA password. This feature is part of the ATA specification and was already implemented and used back in the old IDE times (remember the X-BOX 1?).

With such an ATA password set, all read/write commands to the drive will be ignored until the drive gets unlocked. There’s some discussion about whether ATA passwords should or shouldn’t be used — personally I like the idea of $person not being able to just pull out my drive, modify its unencrypted boot record and put it back into my computer without me noticing.

With current SSDs the ATA password doesn’t just lock access to the drive but also plays a part in the FDE (full disk encryption) featured by modern SSDs. But back to what actually happened…

As people say, it’s good practice to frequently(TM) change passwords. So I did with my ATA password.

And then it happened. My data was gone. All of it. I could still access the SSD with the newly set password but it only contained random data. Even the first couple of KB, which were supposed to contain the partition table as well as unencrypted boot code, magically seem to have been replaced with random data. Perfectly random data.

So, what happened? Back to FDE of recent SSDs: They perform encryption on data written to the drive (decryption on reads, respectively) — no matter if you want it or not.
Encrypted with a key stored on the device — with no easy way of reading it out (hence no backup). This is happening totally transparently; the computer the device is connected to doesn’t have to care about that at all.

And the ATA password is used to encrypt the key the actual data on the drive is encrypted with. Password encrypts key encrypts data.
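
To make the "password encrypts key encrypts data" chain a bit more tangible, here is a toy sketch in Python. It is not how any real SSD firmware works: the XOR "cipher", the ToyDrive class and all names in it are made up for illustration, and a real drive would use a proper cipher and verify the password. The point is only to show why changing the password should leave the data readable, while regenerating the data key turns everything into garbage.

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream derived from a key (illustration only, not real crypto)."""
    return hashlib.shake_256(key).digest(length)


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; applying it twice with the same key undoes it."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def kek_from_password(password: str, salt: bytes) -> bytes:
    """Derive the key-encryption key (KEK) from the ATA-style password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


class ToyDrive:
    """Hypothetical model of a self-encrypting drive."""

    def __init__(self, password: str):
        self.salt = os.urandom(16)
        data_key = os.urandom(32)  # generated on the drive, never exported
        self.wrapped_key = toy_cipher(data_key, kek_from_password(password, self.salt))
        self.media = b""

    def _data_key(self, password: str) -> bytes:
        # Password decrypts key ...
        return toy_cipher(self.wrapped_key, kek_from_password(password, self.salt))

    def write(self, password: str, plaintext: bytes) -> None:
        # ... key encrypts data.
        self.media = toy_cipher(plaintext, self._data_key(password))

    def read(self, password: str) -> bytes:
        return toy_cipher(self.media, self._data_key(password))

    def change_password(self, old: str, new: str) -> None:
        # Only the wrapping changes; the data key and the media stay untouched.
        data_key = self._data_key(old)
        self.wrapped_key = toy_cipher(data_key, kek_from_password(new, self.salt))

    def regenerate_key(self, password: str) -> None:
        # "Secure Erase": a fresh data key makes the old media undecipherable.
        self.wrapped_key = toy_cipher(os.urandom(32), kek_from_password(password, self.salt))
```

In this model, change_password() followed by read() with the new password returns the original bytes, whereas after regenerate_key() the same read() returns random-looking garbage.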

Back to my case: No data, just garbage. Perfectly random garbage. First idea on what happened, as obvious as devastating: the data on the drive gets read and decrypted with a different key than it initially got written and encrypted with. If that’s indeed the case, my data is gone.

This behaviour is actually advertised as a feature. intel calls it “Secure Erase”. No need to overwrite your drive dozens of times like in the old days to ensure the data has irreversibly vanished in the end. No, just wipe the key your data is encrypted with and done. And exactly this seems to have happened to me. I am done.

Fortunately I made backups. Some time ago. Quite some time ago. Of a few directories. Very few. Swearing. Tears. I know, I know, I don’t deserve your sympathies (but I’d still appreciate!).

Anger! Whose fault is it?! Who to blame?!

Let’s check the docs on ATA passwords, which appear to be very clear — from the official Lenovo FAQ:

“Will changing the Master or User hard drive password change the FDE key?”
– “No. The hard drive passwords have no effect on the encryption key. The passwords can safely be changed without risking loss of data.”

Not my fault! Yes! Wait, another FAQ entry says:

“Can the encryption key be changed?”
– “The encryption key can be regenerated within the BIOS, however, doing so will make all data inaccessible, effectively wiping the drive. To generate a new key, use the option listed under Security -> Disk Encryption HDD in the system BIOS.”

Double-checking the BIOS if I unintentionally told my BIOS to change the FDE key. No, I wasn’t even able to find such a setting.

Okay — intermediate result: either buggy BIOS telling my SSD to (re)generate the encryption key (and therewith “Secure Erase” everything on it) or buggy SSD controller, deciding to alter the key at will.

Google! Nothing. Frightening reports about the disastrous “8MB”-bug on the earlier series 320 devices popped up. But nothing on series 540s.

If nothing helps and/or there’s nobody to blame: go on Twitter!

Some Ping-Pong:

Then…

Wait, what?! That’s a known issue? I didn’t find a damn thing in the whole internets! Tell me more!

And to my surprise – they did. For a minute. Shortly before having the respective tweets deleted.

Let’s take a look at what my phone cached:

The deleted tweets contain a link http://intel.ly/2eRl73j which resolves to https://security-center.intel.com/advisory.aspx?intelid=INTEL-SA-00055&languageid=en-fr which is an advisory seemingly describing exactly what happened to me:

“In systems with the SATA devsleep feature enabled, setting or resetting the user or master password by the ATA security feature set may cause data corruption.”

Later on:

“Intel became aware of this issue during early customer validation.”

I guess I just became aware of being part of the “early customer validation”-program. This issue: Personally validated. Check.

Ok, short recap:

  • intel has a severe bug causing data loss on 540s SSD and – according to the advisory – other series as well
  • intel knows about it (advisory dates to 1st of August)
  • intel doesn’t seem to be eager to spread the word about it
  • affected intel SSDs are sold with the vulnerable firmware version
  • nobody knows a damn thing about it (recall the series 320 issue which was big)

Meanwhile, I could try to follow up on @lenovo’s tips:

Sounds good! Maybe, just maybe, that could bring my data back.

Let’s skip the second link, as it contains a dedicated Windows software I’d love to run, but my Windows installation just got wiped (and I’m not really keen on reinstalling and therewith overwriting my precious maybe-still-not-yet-permanently-lost data).

The first link points to an ISO file. Works for me! Until it crashes. Reproducibly. This ISO reproducibly crashes my Lenovo X1 Carbon 3rd generation. Booting from USB thumb-drive (officially supported it says), as well as from CD. Hm.

For now I seem to have to conclude with the following questions:

  • Why can’t I find a damn thing about this bug in the media?
  • Why did intel delete its tweets referencing this bug?
  • Why doesn’t the firmware updater do much besides crashing my computer?
  • Why didn’t I do proper backups?!
  • How do I get my data back?!?1ß11

 

PS: Before I clicked the Publish button I again set up a few search queries. Found my tweets.

by mirko at October 28, 2016 12:55 AM

October 24, 2016

Elphel

Using a flash with a CMOS image sensor: ERS and GRR modes

Operation modes in conventional CMOS image sensors with the electronic rolling shutter

Flash test setup

Most CMOS image sensors have an Electronic Rolling Shutter – the images are acquired by scanning line by line. Their strengths and weaknesses are well known, and extremely wide usage has made the technology somewhat perfect – Andrey might have already said this somewhere before.

There are CMOS sensors with a Global Shutter BUT (if we take the same optical formats):

  • because of more elements per pixel – they have lower full well capacity and quantum efficiency
  • because analog memory is used – they have higher dark current and higher shutter ratio

So, the typical sensor with ERS may support 3 modes of operation:

  • Electronic Rolling Shutter (ERS) Continuous
  • Electronic Rolling Shutter (ERS) Snapshot
  • Global Reset Release (GRR) Snapshot

GRR Snapshot was available in the 10353 cameras, but we never tried it ourselves – one had to write directly to the sensor’s register to turn it on. Now it is tested and working in the 10393s, available through the TRIG (0x14) parameter.

MT9P001 sensor

Further, I will be writing about ON Semi’s MT9P001 image sensor focusing on snapshot modes. The operation modes are described in the sensor’s datasheet. In short:

In ERS Snapshot mode (Fig.1,3), exposure time is constant across all rows but each next row’s exposure start is delayed by tROW (row readout time) from the previous one (and so is the exposure end).

In GRR Snapshot mode (Fig.2,4), the exposure of all rows starts at the same moment but each next row is exposed by tROW longer than the previous one. This mode is well suited to use with a flash.

The difference between ERS Snapshot and Continuous is that in the latter mode the sensor doesn’t wait for a trigger and starts a new image while still finishing reading the previous one. It provides the highest frame rate (Fig.5).
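
The per-row timing of the two snapshot modes can be written down as a small Python sketch. It is a simplified model based only on the description above, not on the sensor's register-level behaviour: time zero is taken as the start of exposure of the first row, and the function names are made up for this example. The default tROW and row count are the MT9P001 values quoted in this post.

```python
T_ROW_US = 33.5  # MT9P001 row readout time, microseconds
N_ROWS = 1944    # active rows


def ers_row_window(row, exposure_us, t_row_us=T_ROW_US):
    """ERS Snapshot: same exposure for every row, each row shifted by tROW."""
    start = row * t_row_us
    return start, start + exposure_us


def grr_row_window(row, exposure_us, t_row_us=T_ROW_US):
    """GRR Snapshot: all rows start together, row n ends n * tROW later."""
    return 0.0, exposure_us + row * t_row_us


if __name__ == "__main__":
    # Compare the first and the last row for a 100 us programmed exposure.
    for row in (0, N_ROWS - 1):
        print(row, ers_row_window(row, 100.0), grr_row_window(row, 100.0))
```

In this model the ERS window has the same length for every row, while the GRR window of the last row is longer than that of the first row by nearly the whole frame readout time.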

Fig.1 Electronic Rolling Shutter (ERS) Snapshot mode

Fig.2 Global Reset Release (GRR) Snapshot mode

Fig.3 ERS mode, whole frame

Fig.4 GRR mode, whole frame

Fig.5 Sensor operation modes, frame sequence

Here are some of the actual parameters of MT9P001:

Parameter: Value
Active pixels: 2592 (H) x 1944 (V)
tROW: 33.5 μs
Frame readout time (Nrows x tROW): 1944 x 33.5 μs ≈ 65 ms

Test setup

  • NC393L-389
  • 9xLEDs
  • Fan (Copal F251R, 25×25 mm, rotating at 5500-8000 RPM)

The LEDs were powered & controlled by the camera’s external trigger output, the delay and duration of which are programmable.

The flash duration was set to 20 μs to catch the fan’s blades, which are marked with stickers, without motion blur: at 5500-8000 RPM the fan turns 0.66-0.96° per 20 μs. There was not enough light from the LEDs, so the setup was placed in a dark environment and the camera color gains were set to 8 (ISO ~800-1000), which makes the images a bit noisy.
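
The rotation during one flash is simple arithmetic and can be checked with a few lines of Python. The function name is just for this example. With a 20 μs flash it gives roughly 0.66° at 5500 RPM and 0.96° at 8000 RPM.

```python
def blade_rotation_deg(rpm, flash_us):
    """Angle in degrees the fan turns during a flash lasting flash_us microseconds."""
    degrees_per_second = rpm / 60.0 * 360.0
    return degrees_per_second * flash_us * 1e-6


if __name__ == "__main__":
    for rpm in (5500, 8000):
        print(f"{rpm} RPM: {blade_rotation_deg(rpm, 20):.2f} deg per 20 us flash")
```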

The trigger period was set to 250 ms – and the synced LEDs were blinking for each frame.

The information on how to program the NC393 camera to generate trigger signal, fps, change sensor’s operation modes (ERS/GRR) can be found here.

Fig.6a Setup: screen, camera view

Fig.6b Setup: fan

Fig.6c Setup: fan, camera view

Flash in ERS mode

Fig.7a Fig.7b Fig.7c

In Fig.7a, to expose all rows to the flash, the exposure needs to be programmed so that the 1st row’s end of exposure comes after the last row’s start of exposure, and the flash has to be delayed until the exposure start of the last row. That makes the single row exposure 72ms+tflash.
Note: there is no ERS effect for moving objects – provided, of course, that the flash is much brighter than the other light sources that will be reducing the contrast during the 72ms frame time.

In Fig.7b the exposure is shorter than the frame readout time – the flash delay can be anything – the result is a brighter band on the image as shown in the example below.

Another way to expose all rows is to keep the flash on from the 1st row start until the last row end (Fig.7c) – that’s as good as keeping the flash on all the time.
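
Which rows end up in the bright band can be estimated with a short Python sketch. It rests on a simplified model that is an assumption, not something taken from the datasheet: row n is exposed from n × tROW to n × tROW + exposure, the flash delay is counted from the exposure start of the first row, and a row counts as lit only if it sees the whole flash. The function name and defaults (MT9P001 numbers from this post) are for illustration.

```python
import math


def ers_flashed_rows(exposure_ms, flash_delay_ms, flash_ms,
                     t_row_ms=0.0335, n_rows=1944):
    """Return (first, last) row index fully lit by the flash in ERS mode, or None."""
    # A row must still be exposing when the flash ends ...
    first = max(0, math.ceil((flash_delay_ms + flash_ms - exposure_ms) / t_row_ms))
    # ... and must already have started exposing when the flash begins.
    last = min(n_rows - 1, math.floor(flash_delay_ms / t_row_ms))
    return (first, last) if first <= last else None


if __name__ == "__main__":
    # 5 ms exposure, 20 us flash delayed by 40 ms
    print(ers_flashed_rows(5.0, 40.0, 0.02))
```

Under this model a 5 ms exposure with a 20 μs flash gives a band of roughly 150 rows (about exposure / tROW), and the band covers the whole frame only when the exposure is at least the frame readout time plus the flash duration.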

Example:

Diagram Screen Fan
Exposure time, ms 5 20
Flash duration, ms 0.02 (20μs) 0.02
Flash delay, ms 40 40
Comments The fan blades are motion blurred in the rows not affected by the delayed 20μs flash. The flash delay is set so the affected rows appear in the middle of the image. Exposure time defines the width of the bright rows band.

Flash in GRR mode

Fig.8a GRR, short exposure, short flash
Fig.8b GRR, flash delayed to readout zone

In GRR mode the flash does not need to be delayed and the exposure of the 1st row can be as low as tflash but the last row will be exposed for tflash+72ms (Fig.8a). If the scene is uniformly illuminated, the image tends to be darker at the top and get brighter towards the bottom. GRR is very useful with a flash lamp.
Note: No ERS effect (as in Fig.7a case).

Fig.8b just shows what happens if the flash is delayed until the frame is being read out.
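
The same kind of estimate works for GRR. This Python sketch assumes a simplified model: all rows start integrating at time zero, row n stops integrating at exposure + n × tROW when it is read out, and the flash delay is counted from time zero. The function name and defaults are illustrative only.

```python
import math


def grr_flashed_rows(exposure_ms, flash_delay_ms, flash_ms,
                     t_row_ms=0.0335, n_rows=1944):
    """Return (first, last) row index that sees the whole flash in GRR mode, or None."""
    # Rows read out before the flash has finished do not get the full flash.
    first = max(0, math.ceil((flash_delay_ms + flash_ms - exposure_ms) / t_row_ms))
    return (first, n_rows - 1) if first < n_rows else None


if __name__ == "__main__":
    print(grr_flashed_rows(0.1, 0.0, 0.02))   # flash not delayed
    print(grr_flashed_rows(0.1, 40.0, 0.02))  # flash delayed into the readout
```

With no delay every row is lit. With a delay that falls into the readout, only the rows not yet read out are lit, which is the situation of Fig.8b.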

Examples:

Diagram Screen Fan
Exposure time, ms 0.1 0.1
Flash duration, ms 0.02 0.02
Flash delay, ms 40 30
Comments The fan blades are motion blurred in the rows not affected by the delayed 20μs flash. All of the rows not read out before the flash are affected.

 

Diagram Screen Fan
Exposure time, ms 0.1 0.1
Flash duration, ms 0.02 0.02
Flash delay, ms 0 0
Comments Fan is rotating. No motion blur. In GRR if flash is not delayed the whole image is affected by the flash. Brighter environment = lower contrast.

 

Diagram Screen Fan
Fig.9 GRR, long exposure, short flash Fig.22 Fig.19
Exposure time, ms 5 10
Flash duration, ms 0.02 0.02
Flash delay, ms 0 0
Comments Fan is rotating. 100 times longer exposure compared to the previous example – the environment is relatively dark.

Conclusions

  • ERS Continuous – max fps, constant exposure, not synced
  • ERS Snapshot – constant exposure, synced
  • GRR Snapshot – synced, use this mode with flash

by Oleg Dzhimiev at October 24, 2016 11:56 PM

October 20, 2016

ZeptoBARS

National Semiconductor LM330 - first LDO (1976) : weekend die-shot

LM330/LM2930 (LM130) is the first LDO linear regulator manufactured by National Semiconductor since 1976.
Die size 1723x1490 µm.



On a different die one can see a funny litho/processing defect over the power transistor (especially at full resolution):


The wafer also had a few test chips:


That was on 3" wafer:


Thanks for the original wafers to Bob Miller, one of the designers of this chip.

October 20, 2016 07:44 AM

October 19, 2016

Free Electrons

Support for Device Tree overlays in U-Boot and libfdt

We have been working for almost two years now on the C.H.I.P platform from Nextthing Co. One of the characteristics of this platform is that it provides expansion headers, which allow connecting expansion boards, also called DIPs in the CHIP community.

In a manner similar to what is done for the BeagleBone capes, it quickly became clear that we should be using Device Tree overlays to describe the hardware available on those expansion boards. Thanks to the feedback from the Beagleboard community (especially David Anders, Pantelis Antoniou and Matt Porter), we designed a very nice mechanism for run-time detection of the DIPs connected to the platform, based on an EEPROM available in each DIP and connected through the 1-wire bus. This EEPROM allows the system running on the CHIP to detect which DIPs are connected to the system at boot time. Our engineer Antoine Ténart worked on a prototype Linux driver to detect the connected DIPs and load the associated Device Tree overlay. Antoine’s work was even presented at the Embedded Linux Conference, in April 2016: one can see the slides and video of Antoine’s talk.

However, it turned out that this Linux driver had a few limitations. Because the driver relies on Device Tree overlays stored as files in the root filesystem, such overlays can only be loaded fairly late in the boot process. This wasn’t working very well with storage devices or for DRM that doesn’t allow hotplug of some components. Therefore, this solution wasn’t working well for the display-related DIPs provided for the CHIP: the VGA and HDMI DIP.

The answer to that was to apply those Device Tree overlays earlier, in the bootloader, so that Linux wouldn’t have to deal with them. Since we’re using U-Boot on the CHIP, we made a first implementation that we submitted back in April. The review process ran its course, and the work was eventually merged and appeared in U-Boot 2016.09.

List of relevant commits in U-Boot:

However, the U-Boot community also requested that the changes should also be merged in the upstream libfdt, which is hosted as part of dtc, the device tree compiler.

Following this suggestion, Free Electrons engineer Maxime Ripard has been working on merging those changes in the upstream libfdt. He sent a number of iterations, which received very good feedback from dtc maintainer David Gibson. And it finally came to a conclusion early October, when David merged the seventh iteration of those patches in the dtc repository. It should therefore hopefully be part of the next dtc/libfdt release.

List of relevant commits in the Device Tree compiler:

Since libfdt is used by a number of other projects (like Barebox, or even Linux itself), all of them will gain the ability to apply device tree overlays when they upgrade their version. People from the BeagleBone and the Raspberry Pi communities have already expressed interest in using this work, so hopefully, this will turn into something that will be available on all the major ARM platforms.

by Maxime Ripard at October 19, 2016 08:47 PM

October 04, 2016

Free Electrons

A Kickstarter for a low cost Marvell ARM64 board

At the beginning of October a Kickstarter campaign was launched to fund the development of a low-cost board based on one of the latest Marvell ARM 64-bit SoCs: the Armada 3700. While being under $50, the board would allow using most of the Armada 3700 features:

  • Gigabit Ethernet
  • SATA
  • USB 3.0
  • miniPCIe

ESPRESSObin interfaces

The Kickstarter campaign was started by Globalscale Technologies, who has already produced numerous Marvell boards in the past: the Armada 370 based Mirabox, the Kirkwood based SheevaPlug, DreamPlug and more.

We pushed the initial support of this SoC to the mainline Linux kernel 6 months ago, and it landed in Linux 4.6. There are still a number of hardware features that are not yet supported in the mainline kernel, but we are actively working on it. As an example, support for the PCIe controller was merged in Linux 4.8, released last Sunday. According to the Kickstarter page the first boards would be delivered in January 2017 and by this time we hope to have managed to push more support for this SoC to the mainline Linux kernel.

We have been working on the mainline support of the Marvell SoC for 4 years and we are glad to see at last the first board under $50 using this SoC. We hope it will help expand the open source community around this SoC family and will bring more contributions to the Marvell EBU SoCs.

by Gregory Clement at October 04, 2016 09:36 AM

October 03, 2016

Free Electrons

Linux 4.8 released, Free Electrons contributions

Linux 4.8 was released on Sunday by Linus Torvalds, with numerous new features and improvements that have been described in detail on LWN: part 1, part 2 and part 3. KernelNewbies also has an updated page on the 4.8 release. We contributed a total of 153 patches to this release. LWN also published some statistics about this development cycle.

Our most significant contributions:

  • Boris Brezillon improved the Rockchip PWM driver to avoid glitches, basing that work on his previous improvement to the PWM subsystem already merged in the kernel. He also fixed a few issues and shortcomings in the pwm regulator driver. This finishes his work on the Rockchip based Chromebook platforms where a PWM is used for a regulator.
  • While working on the driver for the sii902x HDMI transceiver, Boris Brezillon did a cleanup of many DRM drivers. Those drivers were open coding the encoder selection. This is now done in the core DRM subsystem.
  • On the support of Atmel platforms
    • Alexandre Belloni cleaned up the existing board device trees, removing unused clock definitions and starting to remove warnings when compiling with the Device Tree Compiler (dtc).
  • On the support of Allwinner platforms
    • Maxime Ripard contributed a brand new infrastructure, named sunxi-ng, to manage the clocks of the Allwinner platforms, fixing shortcomings of the Device Tree representation used by the existing implementation. He moved the support of the Allwinner H3 clocks to this new infrastructure.
    • Maxime also developed a driver for the Allwinner A10 Digital Audio controller, bringing audio support to this platform.
    • Boris Brezillon improved the Allwinner NAND controller driver to support DMA assisted operations, which brings a very nice speed-up to throughput on platforms using NAND flashes as the storage, which is the case of Nextthing’s C.H.I.P.
    • Quentin Schulz added support for the Allwinner R16 EVB (Parrot) board.
  • On the support of Marvell platforms
    • Grégory Clément added multiple clock definitions for the Armada 37xx series of SoCs.
    • He also corrected a few issues with the I/O coherency on some Marvell SoCs
    • Romain Perier worked on the Marvell CESA cryptography driver, bringing significant performance improvements, especially for dmcrypt usage. This driver is used on numerous Marvell platforms: Orion, Kirkwood, Armada 370, XP, 375 and 38x.
    • Thomas Petazzoni submitted a driver for the Aardvark PCI host controller present in the Armada 3700, enabling PCI support for this platform.
    • Thomas also added a driver for the new XOR engine found in the Armada 7K and Armada 8K families

Here are, in detail, the different contributions we made to this release:

by Alexandre Belloni at October 03, 2016 12:12 PM

Elphel

Elphel presenting at ORCONF 2016, An open source digital design conference

On October 8th, 2016 Andrey will be presenting his work on VDT – Free Software Environment for FPGA Development at an open source digital design conference, ORCONF 2016.

 

The conference will take place in Bologna, Italy, and we are glad for the possibility to meet some of European users of Elphel cameras, and to connect with the community of developers excited about open source design, free software and open hardware.

Elphel will be represented at the conference by Andrey Filippov from the USA headquarters and Alexandre Poltorak, founder of the Swiss 3D4Pi mobile mapping company, which works closely with Elphel to integrate Eyesis4Pi, a stereophotogrammetric camera, for image based 3D reconstruction applications. Andrey will bring and demonstrate the new multisensor NC393 H-camera and Alexandre plans to take some panoramic footage with the Eyesis4Pi camera while in Bologna.

by olga at October 03, 2016 05:12 AM

September 30, 2016

Free Electrons

Free Electrons at the X.org Developer Conference 2016

Every year around September, the X.org Foundation hosts the X.org Developer Conference, which, unlike its name states, is not limited to X.org developers, but gathers all the Linux graphics stack developers, including X.org, Mesa, wayland, and other graphics stacks like ChromeOS, Android or Tizen.

This year’s edition was held last week in the University of Haaga-Helia, in Helsinki. At Free Electrons, we’ve had more and more developments on the graphic stack recently through the work we do on Atmel and NextThing Co’s C.H.I.P., so it made sense to attend.

XDC 2016 conference

There were a lot of very interesting talks during those three days, as can be seen in the conference schedule, but we especially liked a few of those:

DRM HWComposer – Slides, Video

The opening talk was made by two Google engineers from the ChromeOS team, Sean Paul and Zach Reizner. They talked about the work they did on the drm_hwcomposer they wrote for the Pixel C, on Android.

The hwcomposer is one of the HALs in Android. It interfaces between SurfaceFlinger, the display manager, and the underlying display driver. It aims to provide hardware composition features, so that Android can leverage the capabilities of the display engine to perform compositions (through planes and sprites) without having to use the CPU or the GPU to do this work.

The drm_hwcomposer started out as yet another hwcomposer library implementation, this one for the tegra-drm driver in Linux. As they implemented it, it turned into an implementation generic enough to be useful for all the DRM drivers out there. They even introduced some particularly nice features, such as splitting the final screen content into several planes based on the actual displayed content rather than on windows, as is usually done.

Their work also helped to point out a few flaws in the hwcomposer API, that will eventually be fixed in a new revision of that API.

ARC++ – Slides, Video

The next talk was once again from a ChromeOS engineer, David Reveman, who came to show his work on ARC++, the component in ChromeOS that allows Android applications to run. Unsurprisingly, he mostly talked about the display side.

In order to achieve that, he had to implement a hwcomposer that just acts as a proxy between SurfaceFlinger and Wayland, which is used on the ChromeOS side. The GL rendering is still direct though, and each Android application talks directly to the GPU, as usual. Only the composition is forwarded to the ChromeOS side.

To minimize the composition work, ARC++ tries whenever possible to back each application with an overlay, so that the composition happens directly in hardware.

This also led to some interesting challenges, especially since some of the assumptions of both systems are in contradiction. For example, any application can be resized in ChromeOS, while it’s not really a thing in Android where all the applications run full screen.

HDR Displays in Linux – Slides, Video

The next talk we found interesting was Andy Ritger from nVidia explaining how the HDR displays were supposed to be handled in Linux.

He first started by explaining what HDR is exactly. While HDR is just about having a wider range of luminance than on a regular display, you often also get a wider gamut with HDR-capable displays. This means that on those screens you can display a wider range of colors, with better range and precision in their intensity. And while applications have been able to generate HDR content for more than 10 years, the rest of the display stack wasn’t really ready, meaning that you had to convert the HDR colors to colors that your monitor was able to display, using a technique called tone mapping.

He then explained that the standard, non-HDR colorspace, sRGB, is not a linear colorspace. This means that by doubling the encoded luminance of a color, you will not get a color twice as bright on your display. It was designed this way because the human eye is much more sensitive to the various shades of colors when they are dark than when they are bright, which essentially means that the darker the color is, the more precision you want.
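To make the non-linearity concrete, here is a minimal sketch of the standard sRGB decoding curve (the piecewise formula from the sRGB specification). It was not part of the talk; the function and variable names are ours, chosen for illustration only.

```python
def srgb_to_linear(encoded: float) -> float:
    """Convert an sRGB-encoded value in [0, 1] to linear light in [0, 1]."""
    if encoded <= 0.04045:
        # Small values use a linear segment near black.
        return encoded / 12.92
    # The rest of the range follows a power curve with exponent 2.4.
    return ((encoded + 0.055) / 1.055) ** 2.4


dark = srgb_to_linear(0.25)
bright = srgb_to_linear(0.5)
# Doubling the encoded value multiplies the linear light by roughly four, not two.
ratio = bright / dark
```

In this example, going from an encoded value of 0.25 to 0.5 increases the linear light by a factor of a bit more than four, which is the effect described above.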

However, the luminance “resolution” on an HDR display is so good that you actually don’t need that anymore, and you can have a linear colorspace, which in our case is scRGB.

But drawing blindly in all your applications in scRGB is obviously not a good solution either. You have to make sure that your screen supports it (which is exposed through its EDID), but also that you actually tell your screen to switch to it (through the infoframes). And that requires some support in the kernel drivers.

The Anatomy of a Vulkan Driver – Slides, Video

This talk by Jason Ekstrand was a kind of war story about Intel’s bring-up of a Vulkan implementation on their GPUs.

He started by saying that it was actually not that long a project, especially considering that they wrote it from scratch: it took roughly 3 full-time engineers 8 months to come up with a fully compliant and open source stack.

He then explained why Vulkan was needed. While OpenGL did amazingly well to cope with hardware evolution, it was still designed over 20 years ago. It turned out to have some core characteristics that are not really relevant any more, and that are holding application developers back. For example, he mentioned that at its core, OpenGL is based on a singleton-based state machine, which obviously doesn’t scale well on today’s SMP systems. He also mentioned that it is too abstracted, that people just want a lower-level API, and that you might want to render things off screen without X or any context.

This was fixed in Vulkan by effectively removing the state machine, which allows it to scale, and by pushing things like error checking and synchronization directly to the applications. This makes the implementation much simpler and less layered, which also simplifies development and debugging.

He then went on to discuss how code could be shared between the two implementations: implementing OpenGL on top of Vulkan (which was discarded), having some kind of lighter intermediate language in Mesa to replace Gallium, or just putting the common bits in a library and making both the OpenGL and Vulkan libraries use it.

Motivating preemptive GPU scheduling for real-time systems – Slides, Video

The last talk that we want to mention is the talk on preemptive scheduling by Roy Spliet, from the University of Cambridge.

More and more industries, and especially the automotive industry, offload some computations to the GPU, for example to implement computer vision. This is then used in a car to implement autonomous driving, making the car recognize signs or stay in its lane. Obviously, these kinds of computations are supposed to be handled in a real-time system, since you probably don’t want the shiny user interface for your heating to make your car crash into the car in front of it because its rendering was taking too long.

He first explained what real time means and what the usual metrics are, which should come as no surprise to people used to “CPU-based” real-time systems: latency, deadline, execution time, and so on.

He then showed a bunch of benchmarks he used to test his preemptive scheduler, in a workload that was basically running OpenArena while running some computations, on various nouveau based platforms (both desktop-grade GPUs, and embedded SoCs).

This led to some expected conclusions, like the fact that a preemptive scheduler does add some overhead but is on average worth it, while other findings were quite interesting. For example, he observed some worst-case latencies that were quite rare (0.3%), but were actually interference from the display engine filling up its empty FIFOs and creating contention on the memory bus.

Conclusion

Overall, this has been a great experience. The organisation was flawless, and the one-track-only format allows you to meet easily both the speakers and attendees. The content was also highly technical, as you might expect, which made us learn a lot and led us to think about some interesting developments we could do on our various projects in the future, such as NextThing Co’s CHIP.

by Maxime Ripard at September 30, 2016 08:44 AM

Altus Metrum

Second Retirement

At the end of August 2012, I announced my Early Retirement from HP. Two years later, my friend and former boss Martin Fink successfully recruited me to return to what later became Hewlett Packard Enterprise, as an HPE Fellow working on open source strategy in his Office of the CTO.

I'm proud of what I was able to accomplish in the 25 months since then, but recent efforts to "simplify" HPE actually made things complicated for me. Between the announcement in late June that Martin intended to retire himself, and the two major spin-merger announcements involving Enterprise Services and Software... well...

The bottom line is that today, 30 September 2016, is my last day at HPE.

My plan is to "return to retirement" and work on some fun projects with my wife now that we are "empty nesters". I do intend to remain involved in the Free Software and open hardware worlds, but whether that might eventually involve further employment is something I'm going to try and avoid thinking about for a while...

There is a rocket launch scheduled nearby this weekend, after all!

by bdale's rocket blog at September 30, 2016 04:23 AM

September 26, 2016

Village Telco

SECN 4.0 Firmware Available

The fourth release of the Small Enterprise / Campus Network (SECN) firmware for MP02, Ubiquiti and TP-Link devices, designed to provide combined telephony and data network solutions, is now available for download.

The major features of this update are:

  • Updated OpenWrt to the Chaos Calmer release
  • Updated stable batman-adv mesh software to version 2016.1
  • Added factory restore function from Hardware Reset button

Unless you are running a network with some of the first generation Mesh Potatoes, you should consider upgrading to this firmware.   The new factory reset function is particularly handy in that any device can be reset to its factory firmware settings by holding down the reset button for 15 seconds.

Stable firmware is available here:

MP02 –  http://download.villagetelco.org/firmware/secn/stable/mp-02/SECN_4/
TP-Link – http://download.villagetelco.org/firmware/secn/stable/tp-link/SECN_4/
Ubiquiti – http://download.villagetelco.org/firmware/secn/stable/ubnt/SECN_4/

Please subscribe to the Village Telco community development list if you have questions or suggestions.

by steve at September 26, 2016 04:25 PM

September 25, 2016

Bunnie Studios

Name that Ware, September 2016

The Ware for September 2016 is shown below.

Thanks to J. Peterson for sharing this ware!

by bunnie at September 25, 2016 09:45 AM

Winner, Name that Ware August 2016

After reading through the extensive comments on August’s ware, I’m not convinced anyone has conclusively identified the ware. I did crack a grin at atomicthumbs’ suggestion that this was a “mainboard from a Mrs. Butterworth’s Syrup of Things sensor platform”, but I think I’ll give the prize (please email me to claim it) once again to Christian Vogel for his thoughtful analysis of the circuitry, and possibly correct guess that this might be an old school laser barcode scanner.

The ware is difficult to evaluate due to the lack of a key component — whatever it is that mounts into the pin sockets and interacts with the coil or transformer near the hole in the center of the circuit board. My feeling is the placement of that magnetic device is not accidental.

A little bit of poking around revealed this short Youtube video which purports to demonstrate an old-school laser barcode mechanism. Significantly, it has a coil of similar shape and orientation to that of this ware, as well as three trimpots, although that could be a coincidence. Either way, thanks everyone for the entertaining and thoughtful comments!

by bunnie at September 25, 2016 09:45 AM

September 19, 2016

Elphel

NC393 development progress and the future plans

Since we started to deliver the first NC393 series cameras in May, we have been working on the camera software, as the original version was rather limited. While it was capable of serving images/video over the network and recording them on the internal m.2 SSD, it did not have the advanced image acquisition control (through the GUI and programmatically) that was standard for the earlier NC353 series. Now the core functionality is operational, and within a month we plan to have the remaining parts (inter-camera synchronization, working with multiple sensors per port with the 10359 multiplexer, GPS+IMU logging) online too. The FPGA code is already ported, but it needs to be tested, and a fair amount of troubleshooting, identifying problems and weeding out bugs is still left to be done.

Fig 1. Four camvc instances for four channels of NC393 camera

Users of earlier Elphel cameras can easily recognize the familiar camvc web interface – Fig. 1 shows a screenshot of four instances of this interface controlling the 4 sensors of an NC393 camera in “H” configuration.

This web application tests multiple underlying pieces of software in the camera: FPGA code, Linux kernel drivers that control the low level of the camera operation and handle 8 interrupts from the imaging subsystem (the NC353 camera processor had just one), the PHP extension that interacts with the drivers, the image server, the histogram visualization program, the autoexposure and white balance daemons, as well as multiple PHP scripts and JavaScript code. Luckily, the higher the level, the fewer changes we needed in the code from the NC353 (in most cases just a single new parameter, the sensor port, had to be introduced). But the debugging process included going through all the levels of code: bug chasing could start from the JavaScript code, go to the PHP code, then to the PHP extension, to the kernel driver, to direct FPGA control from Python code (bypassing the drivers), and to simulating the Verilog code with Cocotb. Then, when the problem was identified and the HDL code corrected (which usually required several more iterations with simulation), the top-level programs were tested again with the new FPGA bitstream. This is when the integration of all the development in the same Eclipse IDE really pays off: code navigation is easy, changes can be made to programs in different languages, and the software is rebuilt and the results transferred to the target system automatically.

Camera core software

NC393 camera software pursues the same goals as the previous models: allow full-speed operation of the imagers while minimizing real-time requirements for the software on two levels:

  • kernel level (tolerate large delays when waiting for the interrupts to be served) and
  • application level – allow even scripting languages to keep up with the hardware

Interrupt latency is usually not a problem when working with full-frame multi-megapixel images, but the camera can also operate a small window at high FPS. Many operations with the sensor (like changing resolution or image size) require coordinated updates of the sensor internal registers (usually over an I²C connection), of the parameters of the sensor-to-memory FPGA channel (with appropriate latency), of the parameters of the memory-to-compressor channel, and of the parameters of the compressor itself. Additionally, the camera software should provide modified image headers (reflecting the new window size) when the acquired image is recorded or requested over the network.

Application software just needs to tell when (at what frame number) it needs the new window size, and the kernel plus FPGA code will take care of the rest. Slow software should just tell in advance, so the camera code and the sensor itself have enough time to execute the request. Multiple parameter modifications designated for a specific frame will be applied almost simultaneously, even if frame sync pulses were received from the sensor while the application was sending the new data.

Image-derived data remains available long after the image is acquired

Similar things happen with the data received from the sensor: the image itself and the histograms (which are used for automatic exposure adjustment and white balancing). An application does not need to read them before the next frame data arrives. Compressed images are kept in a large (64MB per port) ring buffer in the system memory, which can hold several seconds of images. Histograms (for up to 4 different windows inside the full image for each sensor port) are preserved for 15 frames after being acquired and transferred over DMA to the system memory. A subset of essential acquisition parameters and the image metadata (needed for Exif output) are preserved for 2048 and 511 frames respectively.

Fig 2. Interaction of the image sensor, FPGA, kernel drivers and user space applications

FPGA frame-based command sequencers

There are 2 sequencers for each of the four sensor ports on the FPGA level – they do not use any of the CPU resources:

  • I²C sequencers handle relatively slow I²C commands to be sent to the sensor; usually these commands need to arrive before the start of the next frame,
  • Command sequencers perform writes to the memory-mapped registers and so control the FPGA operation. These operations need to happen in guaranteed time just after the start of frame, before the corresponding subsystems begin to process the incoming image data.

Both are synchronized by the “start of frame” signals from the sensor. Each sequencer has 16 frame pages, and each page contains 64 command slots.

Sequencers accept both absolute (modulo 16) frame addresses and frame addresses relative to the current frame. Writing to the current frame (zero offset) is interpreted as “ASAP” and the commands are issued immediately, not synchronized by the start of frame. Additionally, if the commands were written too late and the frame sync arrived before they were executed, they will still be processed before the next frame slot page is activated.
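The addressing rules can be illustrated with a small software model. This is only a sketch of the behavior described above, not the FPGA logic: the class, method and command names are made up for the example, and the handling of late commands is not modeled.

```python
PAGES = 16            # frame pages per sequencer (from the text)
SLOTS_PER_PAGE = 64   # command slots per page (from the text)


class SequencerModel:
    """Toy software model of one frame-based sequencer (not the FPGA logic)."""

    def __init__(self, current_frame=0):
        self.current_frame = current_frame
        self.pages = [[] for _ in range(PAGES)]
        self.issued = []  # commands in the order they were sent out

    def write_relative(self, offset, command):
        """Schedule a command `offset` frames ahead; offset 0 means ASAP."""
        if not 0 <= offset < PAGES:
            raise ValueError("offset must be in 0..15")
        if offset == 0:
            self.issued.append(command)  # not synchronized to start of frame
            return
        page = self.pages[(self.current_frame + offset) % PAGES]
        if len(page) >= SLOTS_PER_PAGE:
            raise OverflowError("no free command slots in this frame page")
        page.append(command)

    def write_absolute(self, frame, command):
        """Schedule a command for an absolute frame number, taken modulo 16."""
        self.write_relative((frame - self.current_frame) % PAGES, command)

    def start_of_frame(self):
        """Advance to the next frame and issue the commands queued for it."""
        self.current_frame += 1
        page = self.pages[self.current_frame % PAGES]
        self.issued.extend(page)
        page.clear()


seq = SequencerModel(current_frame=14)
seq.write_relative(3, "set_window")    # lands in page (14 + 3) % 16 == 1
seq.write_absolute(17, "set_gain")     # frame 17 maps to the same page 1
seq.write_relative(0, "read_status")   # zero offset: issued right away
for _ in range(3):
    seq.start_of_frame()
```

After three start-of-frame events the model is at frame 17, and the two commands queued for that frame are issued in the order they were written, after the ASAP command.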

Kernel support of the image frame parameters

There are many frame-related parameters that control image acquisition in the camera, including various sensor register settings, parameters that control gamma conversion, image format for recording to video memory (DDR3 dedicated to the FPGA, not shared with the CPU), compressor format, signal gains, color saturation, compression quality, coring parameters, and histogram window size and position. There is no such thing as “current frame parameters” in the camera: at any given moment the sensor may be programmed for a certain image size, while its output data reflects the previous frame format, and the compressor has still not finished with an even earlier image. That means that the camera should be aware of multiple sets of the same parameters, each applicable to a certain frame (identified by an absolute frame number). So what the sensor is receiving “now” is not the “current” frame parameters, but the parameters of a frame that will happen 2 frame intervals later.

Current implementation keeps parameters (all parameters are unsigned long) in a 16-element ring buffer, each element being a

/** Parameters block, maintained for each frame (0..15 in NC393) of each sensor channel */
struct framepars_t {
        unsigned long pars[927];      ///< parameter values (indexed by P_* constants)
        unsigned long functions;      ///< each bit specifies a function to be executed (triggered by some parameter change)
        unsigned long modsince[31];   ///< parameters modified after this frame - each bit corresponds to one element in par[960] (bit 31 is not used)
        unsigned long modsince32;     ///< parameters modified after this frame super index - non-zero elements in modsince[31] (bit 31 is not used)
        unsigned long mod[31];        ///< modified parameters - each bit corresponds to one element in par[960] (bit 31 is not used)
        unsigned long mod32;          ///< super index - non-zero elements in mod[31] (bit 31 is not used)
};

Interrupt-driven processing of the parameters takes CPU time (in contrast with the FPGA sequencers described before), so the processing should be efficient and not iterate through almost a thousand entries for each interrupt. It is also not practical to copy a full set of parameters from the previous frame. The parameters structure for each frame includes the mod[31] array, where each element stores a bit field that describes modification of 32 consecutive parameters, and a single mod32, where each bit represents one element of mod. So mod32 == 0 means that there were no changes (as is true for the majority of frames) and there is nothing for the interrupt service routine to do. The additional fields modsince[31] and modsince32 mean that there were changes to the parameter after this frame. They are used to initialize a new (15 frames ahead of “now”) frame entry in the ring buffer. The buffer is modulo 16, so parameters for [this_frame + 15] share the same memory address as [this_frame - 1], and if a parameter is not “modified since” (as is true for the majority of parameters), nothing has to be done for it when advancing this_frame.
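The two-level bit index can be sketched in a few lines of Python. This is an illustration of the idea only, not the driver code: the class and method names are ours, and we assume, following the paragraph above, that each mod element covers 32 consecutive parameter indices.

```python
NUM_PARS = 927   # size of pars[] in struct framepars_t
MOD_WORDS = 31   # size of mod[] in struct framepars_t


class FrameMods:
    """Tracks which parameters of one frame were modified (illustrative model)."""

    def __init__(self):
        self.mod = [0] * MOD_WORDS  # one bit per parameter
        self.mod32 = 0              # one bit per non-zero element of mod

    def mark(self, index):
        """Record that parameter `index` was modified."""
        if not 0 <= index < NUM_PARS:
            raise IndexError("parameter index out of range")
        word, bit = divmod(index, 32)
        self.mod[word] |= 1 << bit
        self.mod32 |= 1 << word

    def modified(self):
        """Return modified indices, visiting only the words flagged in mod32."""
        result = []
        summary = self.mod32
        while summary:
            word = (summary & -summary).bit_length() - 1  # lowest set bit
            summary &= summary - 1                        # clear that bit
            bits = self.mod[word]
            while bits:
                bit = (bits & -bits).bit_length() - 1
                bits &= bits - 1
                result.append(word * 32 + bit)
        return result


mods = FrameMods()
nothing_to_do = mods.mod32 == 0   # the common case: a single comparison
for index in (70, 5, 64):
    mods.mark(index)
changed = mods.modified()
```

With this layout an unmodified frame costs a single comparison, and a modified one costs work proportional to the number of changed parameters rather than to the size of the table.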

There is a configurable parameter that tells the parameter processing at interrupts how far to look ahead into the future (Fig. 2 shows frames that are too far in the future hatched). The function starts with the current frame and proceeds into the future (up to the specified limit), looking for modified but not yet processed parameters. Processing of the modified parameters involves calling up to 32 “generic” (sensor-agnostic) functions and up to 32 of their sensor-specific variants. Each parameter that triggers some action when modified is assigned a bitmask of functions to schedule on change, and when the parameter is written to the buffer, this bitmask is OR-ed into the functions field for the frame, so during the interrupt only this single field has to be considered.

Processing the parameters of a frame scans all the bits in functions (in a defined order, starting from the LSB, generic first). The functions involve verification and calculation of derivative values, and writing data to the FPGA command and I²C sequencers (deep green and blue in Fig. 2 show the newly added commands to the sequencers). Additionally, some actions may schedule other parameter changes to be processed at a later frame.
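A minimal sketch of this scheduling scheme follows. The parameter names, function bit numbers and handlers are hypothetical and chosen only for illustration; the sensor-specific variants and the look-ahead over several frames are left out.

```python
# Hypothetical function bit numbers and parameter names, for illustration only.
FUNC_EXPOSURE, FUNC_WINDOW, FUNC_MEMORY, FUNC_QUALITY = 0, 1, 2, 3

# Each parameter maps to the bitmask of functions to schedule when it changes.
TRIGGERS = {
    "EXPOSURE": 1 << FUNC_EXPOSURE,
    "WOI_WIDTH": (1 << FUNC_WINDOW) | (1 << FUNC_MEMORY),
    "QUALITY": 1 << FUNC_QUALITY,
}


def write_parameter(frame, name, value):
    """Store a parameter and OR its trigger mask into the frame's functions."""
    frame["pars"][name] = value
    frame["functions"] |= TRIGGERS.get(name, 0)


def process_frame(frame, handlers):
    """Call the handler of every scheduled function, starting from the LSB."""
    called = []
    pending = frame["functions"]
    frame["functions"] = 0
    bit = 0
    while pending:
        if pending & 1:
            handlers[bit](frame)
            called.append(bit)
        pending >>= 1
        bit += 1
    return called


log = []
names = ("exposure", "window", "memory", "quality")
handlers = [lambda frame, n=n: log.append(n) for n in names]

frame = {"pars": {}, "functions": 0}
write_parameter(frame, "WOI_WIDTH", 1920)
write_parameter(frame, "EXPOSURE", 5000)
called = process_frame(frame, handlers)
```

Although the window width was written first, the exposure function runs first because it occupies the lowest bit, which shows how the bit order rather than the write order defines the sequence of actions.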

User space applications and the frame parameters

Applications see frame parameters through a character device driver that supports write, mmap, and (overloaded) lseek.

  • write allows setting a list of parameters and applying these changes to a particular frame as a single transaction
  • mmap provides read access to all the frame parameters for up to 15 frames in the future; parameter defines are provided through the header files under kernel include/uapi, so applications (such as the PHP extension) can access them by symbolic names.
  • lseek is heavily overloaded, especially for positive offsets to SEEK_END – such commands initiate special actions in this driver, such as waiting for the specific frame. It is partially used instead of the ioctl command, because lseek is immediately supported in most languages while ioctl often requires special extensions.

Communicating image data to the user space

Similar to the handling of the frame acquisition and processing parameters, which deals with the future and lets even slow applications control the process with frame accuracy, other kernel drivers use the FPGA code features to give applications sufficient time to process acquired data before it is overwritten by newer data. These drivers use a similar character device interface, with mmap for data access and lseek for control; some use write to send data to the driver.

  • circbuf driver provides access to the compressed image data in any of the four 64MB ring buffers that contain data compressed by the FPGA (the FPGA also provides the microsecond-accurate timestamp and the image size). Each image is 32-byte aligned, and the FPGA skips an additional 32 bytes after each frame. The compressor interrupt service routine (located in sensor_common.c) fills this area with some of the image acquisition metadata.
  • histograms driver handles the histograms for the acquired images. Histograms are calculated in the FPGA on the image-to-memory path and so are active even if compressor is stopped. There are 3 types of histogram data that may be needed by the applications, and only the first one (direct) is provided by the FPGA over DMA, two others (derivative) are calculated in the driver and cached, so application request for the same derivative histogram does not require re-calculation. Histograms are calculated for the pixels after gamma-conversion even if raw (2 bytes/pixel) data is recorded, so table indices are always in the range of 0 to 255.
    • direct histograms are provided by the FPGA that maintains data for 16 consecutive (last acquired) frames, for each of the 4 color channels (2 separate green ones), for each of the sensor ports and sub-channels (when multiplexers are used). Each frame data contain 256*4=1024 of the unsigned long (32 bit) values.
    • cumulative histograms contain the corresponding cumulative values: each element equals the sum of the direct histogram values from 0 to the specified index. When divided by the value at index 255 (the total number of pixels of this color channel = 1/4 of all pixels in the WOI), the result tells what part of all pixels have values less than or equal to the current one.
    • percentiles are reversed cumulative histograms: they tell the pixel level for which a certain fraction of all pixels has a value equal to or below it. These values refer to non-linear (gamma-converted) pixel values, so automatic exposure also uses reversed gamma tables and interpolates between the two values in the percentile table.
  • jpeghead driver generates JPEG/JP4 headers that need to be concatenated with the compressed output from circbuf (and with the end-of-image 0xff/0xd9 marker) to make a complete image file
  • exif driver manipulates Exif data in the camera: it stores Exif frame-variable data for the last acquired frames in a 512-element ring buffer, allows specifying and setting additional Exif fields, and provides mmap read access to the metadata.
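The relation between the direct, cumulative and percentile histograms described in the list above can be shown with a short sketch. It is not the driver code: the function names and the test data are ours, it handles a single color channel, and it omits the interpolation and reversed gamma tables that the autoexposure daemon uses.

```python
from bisect import bisect_left
from itertools import accumulate


def cumulative(direct):
    """Running sum of a 256-bin direct histogram."""
    return list(accumulate(direct))


def fraction_at_or_below(cum, level):
    """Part of all pixels whose value is less than or equal to `level`."""
    return cum[level] / cum[-1]


def percentile_level(cum, fraction):
    """Lowest pixel level at which the cumulative count reaches `fraction`."""
    return bisect_left(cum, fraction * cum[-1])


# Made-up histogram for one color channel: 400 pixels at three levels.
direct = [0] * 256
direct[10] = 100
direct[100] = 200
direct[200] = 100

cum = cumulative(direct)
part = fraction_at_or_below(cum, 100)   # pixels with value <= 100
median = percentile_level(cum, 0.5)     # level reached by half of the pixels
quarter = percentile_level(cum, 0.25)
```

Because the cumulative histogram never decreases, the percentile lookup is a binary search, which is one reason to cache the derivative tables instead of recalculating them for each request.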

Camera applications

Current applications include

  • Elphel PHP extension allows multiple PHP scripts to work in the camera, providing server-side of the web applications functionality, such as camvc.
  • imgsrv is a fast image server that bypasses camera web server and transfers images and metadata avoiding any copying of extra data – network controller sends data over DMA from the same buffer where FPGA delivered compressed data (also over DMA). Each sensor port has a corresponding instance of imgsrv, serving different network ports.
  • camogm allows simultaneous recording image data from multiple channels at up to 220 MB/s
  • autoexposure is an auto exposure and white balance daemon that uses image histograms for the specified WOI to adjust exposure time, sensor analog gains and signal gain coefficients in the FPGA.
  • pnghist is a CGI program that visualizes histograms as PNG images, it supports several histogram presentation modes.

Other applications that were available in the earlier NC353 series cameras (such as RTP/RTSP video streamer) will be ported shortly.

Future plans

NC393 camera has 12 times higher performance than the earlier NC353 series, and porting of the functionality of the NC353 is much more than just tweaking of the FPGA code and the drivers – large portions had to be redesigned completely. Camera FPGA project includes provisions for advanced image processing, and that changed the foundation of the camera code. That said, it is much more exciting to move forward and implement functionality that did not exist before, but we had to finish that “boring” part first. And as now it is coming closer, I would like to share our future development plans and invite others who may be interested to cooperate.

New sensors

NC393 was designed to have maximal flexibility in the sensor interface – this we learned from our experience with the 303-313-333-353 series of cameras. So far NC393 has been tested with one parallel interface sensor and one with a 4-lane HiSPI interface (both have links to the circuit diagrams). Each port can use 8 lanes + clock (9 differential pairs) and several more control/clock signals. Larger/faster sensors may use multiple sensor ports and so multiply the available interface lines.
It will be interesting to try high sensitivity large pixel E2V sensors and ToF technology. TI OPT8241 seems to be a good fit for NC393, but OPT8241 I²C register map is not provided.

Quadcopters flying Star Wars style

Most quadcopters use brushless DC motors (BLDC) that may be tricky to control. Integrated motor controllers that detect rotor position using the voltage on the power coils or external sensors (and so emulate ancient physical brushes) work fine when you apply only moderate variations to the rotation speed, but may fail if you need to change the output fast and in a precisely calculated manner. An FPGA can handle such calculations better and leave CPU resources for the high-level tasks. I would imagine such motor control to include some tiny FPGA paired with the high-current MOSFET drivers attached to the motors, with lightweight SATA cables (such as the 3M 5602 series) connecting them to the NC393 daughter board. NC393 already has a dual ARM CPU, so it can use existing free software to fly drones and take video/images at the same time. Making it not just fly, but do “tricks”, will be really exciting.

Image processing and High Level Synthesis (HLS) alternative

NC393 FPGA design started around a 16-channel memory access optimized for 2D data. Common memory may not be the most modern approach to parallel processing, but when the bulk memory (0.5GB of DDR3) is a single device, it has to be shared between the channels, and not all the module connections can be converted to simple stream protocols. Even before we started to add image processing, we had to maintain two separate bitstreams: one for the parallel sensors, and the other for HiSPI (serial) ones. They cannot be made run-time programmable as even the voltage levels are different, to say nothing of the fact that both interfaces together will not fit into the Zynq FPGA – we are already balancing around 80% slice utilization. Theoretically NC393 can use two parallel and two serial sensors (two pairs of sensor ports use two separate I/O banks with individually programmable supply voltage), but that adds even more variants to the top-level module configuration and matching constraints files, and makes the code less readable.

Things will get even more complicated when there will be more active memory channels involved in the processing, especially when the inter-synchronization of the different modules processing multi-sensor 2d data is more complex than just stream in/stream out.

When processing multi-view scenes we will start with de-warping followed by FFT to implement correlation between the 4 simultaneous images and so significantly reduce the ambiguity of a stereo-pair correlation. In parallel with working on Verilog code for the new modules I plan to try to reduce the complexity of the inter-module connections, making it more flexible and easier to maintain. I would love to use something higher level, but unfortunately there is nothing for me to embrace and use.

Why I do not believe in HLS

Focusing on the algorithmic level and leaving RTL implementation to the software is definitely a good idea, but the task is much more ambitious than to try to replace GCC or GNU/Linux operating system that even most proprietary and encryption-loving companies have to use. The gap between the algorithms and RTL code is wider than between the C code and the Assembler for the CPU, regardless of some nice demos with the Sobel filter applied to the live video stream or similar simple processing.

One of the major handicaps of the existing approach is an obsession with making modern reprogrammable FPGA code mimic the fixed-function hardware integrated circuits popular in the last century. To be software-like is much more powerful than to look like some old hardware. It is sure that separation of the application levels, use of the standard APIs are important, but it is most beneficial in the mature areas. In the new ones I consider it to be a beauty of coding to be able to freely cross any implementation levels, break some good programming practices, adjust it here and there, redesign and start over, balance overall performance and structure to create something new. Features and interfaces freeze will come later.

So what to use instead?

I do not yet know what it should be exactly, but I would borrow Python decorators and functionality of Verilog generate operators. Instead of just instantiating “black boxes” with rigid interfaces – allow the wrapper code (both automatically generated and hand-crafted) to get inside the instantiated modules code and modify it for the particular instances. “Decoration” meaning generation of the modified module code for the specific instances. Something like programmatic parametrization (modifying code, not just the parameter values, even those that direct generate operators).

Elphel FPGA code is source code based; there are no “black boxes” in the design. And as all the code (109579 lines of it) is available, it is accessible to the software too, and “robots” can analyze it and make it easier to manage. We would like to have them as “helpers”, not as “wizards” who can offer just a few choices among the pre-programmed options.

To some extent we already do have such “helpers”: our current Python code “understands” Verilog parameter definitions in the source code, including some calculations of the derived ones. That makes it possible for the Python programs running in the camera to use the same register addresses and bit fields as defined for the FPGA code implemented in the current bitstream.

When the cameras became capable of running FPGA code controlled by the Python program and we were ready to develop kernel drivers, we added extra functionality to the existing Python code. Now it is able not just to read Verilog parameters for itself, but also to generate C code to facilitate driver development. This converter is not a compiler-like program that takes Verilog input and generates C header files. It is still a human-coded program that retrieves the parameter values from the Verilog code and helps the developer by using the familiar content-assist functionality of the IDE, by detecting and flagging misspelled parameter names in PyDev (Eclipse IDE plugin for Python), and by re-generating the output when the Verilog source is modified.
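The idea of reading Verilog parameters and emitting matching C definitions can be illustrated with a minimal sketch. This is not the Elphel converter: the regular expression, the function names and the parameter names in the usage note are assumptions made for illustration, and only plain integer literals are handled (derived parameters written as expressions are skipped).

```python
import re

# Matches simple declarations such as:  parameter SOME_ADDR = 'h080;
# Typed or ranged parameters are deliberately not covered by this sketch.
PARAM_RE = re.compile(r"\b(?:parameter|localparam)\s+(\w+)\s*=\s*([^;,]+)[;,]")

# Verilog literal: optional width, apostrophe, optional sign flag, base, digits.
LITERAL_RE = re.compile(r"(\d*)'([sS]?)([bBoOdDhH])([0-9a-fA-F]+)")


def parse_verilog_int(text):
    """Convert a Verilog integer literal ('h80, 10'h3ff, 8'd12, 42) to int."""
    text = text.strip().replace("_", "")
    match = LITERAL_RE.fullmatch(text)
    if match:
        base = {"b": 2, "o": 8, "d": 10, "h": 16}[match.group(3).lower()]
        return int(match.group(4), base)
    return int(text)  # plain decimal; raises ValueError for expressions


def extract_parameters(verilog_source):
    """Return {name: value} for parameters defined by integer literals."""
    params = {}
    for name, value in PARAM_RE.findall(verilog_source):
        try:
            params[name] = parse_verilog_int(value)
        except ValueError:
            pass  # expression-valued (derived) parameter, ignored here
    return params


def to_c_header(params):
    """Render the parameters as C preprocessor definitions."""
    return "\n".join("#define %s 0x%x" % (name, value)
                     for name, value in sorted(params.items()))
```

Feeding it a source fragment containing `parameter MCONTR_BASE = 'h080,` (a made-up name) would yield the line `#define MCONTR_BASE 0x80`, so the Python side and the C driver share one definition taken from the HDL.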

We also used Python to generate Verilog code for the AHCI implementation, as it seemed more convenient than the native Verilog generate. Wrapping Verilog in Python produces clean (for human analysis) Verilog code that can be used in a wave viewer and in the timing analysis of the implementation tools. It will be quite natural to make the Python programs understand more of the Verilog code and help us manage the structure and generate the matching constraints files that FPGA implementation tools require in addition to the HDL code. FPGA professionals probably use TCL scripts for that. It may be a nice language, but I have never used it outside of FPGA scripting, so it is always a problem for me to recall how to use it when coming back to FPGA coding after long interruptions.

I did look at MyHDL of course, but it is not exactly what I need. MyHDL tries to replace Verilog completely and the structural modeling part of it suffers from the focus on RTL. I just want Python to help me with Verilog code, not to replace it (similar to how I do not think that Verilog is the best language to simulate CPU activities). I love Cocotb more – even its gentle name (COroutine based COsimulation) tells me that it is not “instead of” but “in addition to”. Cocotb does not have a ready solution for this project either (it was never a goal of this program) so here is an interesting project to implement.

There are several specific cases that I would like to be handled by the implementation.

  • add new functionally horizontal connections in a clean way between hierarchical objects: add outputs all the way up to the common parent module, wires at the top, and then inputs all the way down to the destination. Of course it is usually better to avoid such extra connections, but their traces in module ports help to keep them under control. Such connections may be just temporary and later removed, or be a start of adding new functionality to the involved modules.
  • generate a low footprint debug network to selected hierarchical modules and generate target Python code to probe/modify registers through this network accessing data by the HDL hierarchical names.
  • control the destiny of the decorators – either keep them as separate pieces of code or merge with the original source and make the result HDL code a new “co-designed” source.

And this is what I plan to start with (in parallel to adding new Verilog code). Try to combine existing pieces of the solution and make it a complete one.

by Andrey Filippov at September 19, 2016 07:41 PM

September 13, 2016

Elphel

Reaching 220 MB/s sustained write speed with SATA-2 controller

Introduction

Elphel cameras use camogm, a user space application, for recording acquired images to disk storage. The application was developed to use storage devices such as disk drives or USB drives mounted in the operating system. The Elphel393 model cameras have a SATA-2 controller implemented in the FPGA and a system driver for this controller, and they can be equipped with an SSD drive. We were interested in performing write speed tests using the SATA controller and a couple of M.2 SSDs to find out the top disk bandwidth camogm can use during image recording. Our initial approach was to try the commonly accepted method of using the hdparm and dd system utilities. The first disk was SanDisk SD8SMAT128G1122. According to the manufacturer specification [pdf], this is a low power disk for embedded applications and it can show 182 MB/s sequential write speed in SATA-3 mode. We had the following:

~# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 274 MB in  3.02 seconds =  90.70 MB/sec

~# time sh -c "dd if=/dev/zero of=/dev/sda2 bs=500M count=1 && sync"
1+0 records in
1+0 records out

real	0m6.096s
user	0m0.000s
sys	0m5.860s

which results in total write speed around 82 MB/s.
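The figure above is simply the amount of data written divided by the wall-clock ("real") time reported by `time`. A minimal sketch of that arithmetic follows; the helper names are hypothetical, and the size is treated as 500 MB as in the text (the `M` suffix of dd actually denotes units of 1048576 bytes).

```python
import re


def parse_real_time(text):
    """Convert the 'real' value printed by `time`, e.g. '0m6.096s', to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text.strip())
    if match is None:
        raise ValueError("unexpected time format: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


def write_speed(size_mb, real_time):
    """Average write speed in MB/s for size_mb written in the given real time."""
    return size_mb / parse_real_time(real_time)


# dd wrote one 500M block and `time` reported real 0m6.096s
print("%.1f MB/s" % write_speed(500, "0m6.096s"))  # 82.0 MB/s
```

The same helper can be applied to any other `time` output from this kind of test.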

The second disk was Crucial CT250MX200SSD6 [pdf] and its sequential write speed should be 500 MB/s in SATA-3 mode. We had the following:

~# hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 236 MB in  3.01 seconds =  78.32 MB/sec

~# time sh -c "dd if=/dev/zero of=/dev/sda2 bs=500M count=1 && sync"
1+0 records in
1+0 records out

real	0m6.376s
user	0m0.010s
sys	0m5.040s

which results in total write speed around 78 MB/s. Our preliminary tests had shown that the controller can achieve 200 MB/s write speed. Taking this into consideration, the performance figures obtained were not very promising, so we decided to add one new feature in the latest version of camogm: the ability to write data to a raw storage device. A raw storage device is a disk or a disk partition accessed directly, bypassing any operating system caches and buffers. This type of access can potentially improve I/O performance but requires additional effort to implement data management in software.

First approach

We tried to bypass the file system in the first attempt and used a device file (/dev/sda in our case) in camogm for I/O operations. We compared CPU load and I/O wait time during write operations to a partition with an ext4 file system and to a device file. dstat turned out to be a very handy tool for generating system resource statistics. The statistics were collected during 3 periods of operation: in idle mode before writing, during writing, and in idle mode after writing. All these periods can be clearly seen on the figures below. We also changed the quality parameter, which affects the resulting size of JPEG files. Files with the quality parameter set to 80 were around 1 MB in size and files with the quality parameter set to 90 were almost 2 MB in size.

sys-q80
sys-q90

As expected, the figures show that a device file write operation takes less CPU time than the same operation with a file system, because there are no file system operations or caches involved.

wai-q80
wai-q90

CPU wait for disk IO on the figures means the percentage of time the CPU waits for an I/O operation to complete. Here the camogm process spends more CPU time waiting for data to be written during device file operations than during file system operations, and again this could be explained by the fact that caching on the file system level is not used.

We also measured the time camogm spent on writing each individual file to device file and to files on ext4 file system.

write-q80
write-q90

The clear patterns on the figures correspond to several sensor channels used during recording and each channel produced JPEG files different in size from the other channels. As we have already seen, file system caching has its influence on the results and the difference in overall write time becomes less obvious when the size of files increases.

Although the tests had shown that writing data to file system and to device file had different overall performance, we could not achieve any significant performance gain which would narrow the gap between initial results and preliminary write speed data. We decided to try another approach: only pass commands to disk driver and write data from disk driver.

Second approach

The idea behind this approach was simple. We already have JPEG data in a circular buffer in memory, and the disk driver only needs pointers to the data we want to write at any given moment in time. camogm was modified to pass those pointers and some meta information to the driver via its sysfs interface. We modified our AHCI driver as well to add new functions. The driver accepts a command from camogm, aligns the data buffers to a predefined boundary and the frame as a whole to a physical sector boundary, and places the command in the command queue. Commands are picked from the command queue right after the current disk transaction is complete. We measured the time spent by the driver preparing a new command, waiting for an interrupt after a command had been issued, and waiting for a new command to arrive. The total data size per transaction was around 9.5 MB in the case of SD8SMAT128G1122 and around 3 MB in the case of CT250MX200SSD6. The disks were installed in cameras with 14 Mpx and 5 Mpx sensors respectively.
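The alignment step can be sketched as simple integer arithmetic. This is an illustration only, not the driver code: the 32-byte chunk boundary and the 512-byte sector size are assumed values, and the function names are hypothetical.

```python
def align_up(value, boundary):
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary


def padded_frame(chunk_sizes, chunk_align=32, sector_size=512):
    """Align every chunk, then pad the whole frame to a sector boundary.

    Returns (aligned chunk sizes, padding in bytes appended to the frame).
    chunk_align and sector_size are assumptions made for this sketch.
    """
    aligned = [align_up(size, chunk_align) for size in chunk_sizes]
    total = sum(aligned)
    return aligned, align_up(total, sector_size) - total


# Two chunks of 100 and 1000 bytes become 128 and 1024 bytes (1152 in total),
# and 384 bytes of padding bring the frame to 1536 bytes, i.e. three sectors.
print(padded_frame([100, 1000]))  # ([128, 1024], 384)
```

Padding the frame to a whole number of sectors means that every write command starts and ends on a sector boundary, so the disk never has to be asked for a partial sector.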

write-sd
write-ct

These figures show that the time spent in the driver on command preparation is almost negligible in comparison to the time spent waiting for the write command to complete and this was exactly what we finally wanted to get. We could achieve almost 160 MB/s write speed for SD8SMAT128G1122 and around 220 MB/s for CT250MX200SSD6. Here is a summary of results obtained in different modes of writing for two test disks:

Disk write performance
Disk             | File system access | Device file access | Raw driver access
SD8SMAT128G1122  | 82 MB/s            | 90 MB/s            | 160 MB/s
CT250MX200SSD6   | 78 MB/s            | not tested         | 220 MB/s

CT250MX200SSD6 was not tested in device file access mode as it was clear that this method did not fit our needs.

Disk access sharing

One of the problems we had to solve while working on the driver was sharing disk access between the operating system and the driver during recording. The disk in the camera had two partitions: one was formatted with an ext4 file system and mounted in the operating system, and the other was used as a data buffer for camogm. It is possible that some user space application could access the mounted partition while camogm is writing data to the disk data buffer, and this situation should be handled correctly. camogm, as a top priority process, should always have the full disk bandwidth, and other system processes should be granted access only during periods of time when camogm is waiting for the next frame. libata has a built-in command deferral mechanism, and we used this mechanism in the driver to decide whether the system process should have access to the disk or the command should be deferred. To use this mechanism, we added our function to the ATA port operations structure:

static struct ata_port_operations ahci_elphel_ops = {
    ...
    .qc_defer       = elphel_qc_defer,
};

This function is called every time a new system command arrives and the driver can defer the command in case it is busy writing data.
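The deferral decision itself can be pictured with a toy model. This is not the driver or libata code: the class, its methods and the two result constants are stand-ins invented for this sketch, which only shows system commands being held back while the recorder is busy and released when it becomes idle.

```python
from collections import deque

# Stand-ins for the "issue now" / "try again later" results of a defer check.
ACCEPT, DEFER = 0, 1


class PortModel:
    """Toy model: system commands reach the disk only while the recorder is idle."""

    def __init__(self):
        self.recorder_busy = False
        self.pending = deque()  # system commands waiting for the disk
        self.issued = []        # system commands that reached the disk

    def qc_defer(self):
        """Decide whether a newly arrived system command may be issued."""
        return DEFER if self.recorder_busy else ACCEPT

    def submit(self, command):
        self.pending.append(command)
        self.dispatch()

    def dispatch(self):
        """Issue pending commands for as long as the defer check allows it."""
        while self.pending and self.qc_defer() == ACCEPT:
            self.issued.append(self.pending.popleft())

    def set_recorder_busy(self, busy):
        self.recorder_busy = busy
        self.dispatch()  # an idle recorder lets deferred commands through


port = PortModel()
port.set_recorder_busy(True)   # a frame is being written
port.submit("read block A")    # deferred, nothing is issued yet
port.set_recorder_busy(False)  # recorder waits for the next frame
print(port.issued)             # ['read block A']
```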

by Mikhail Karpenko at September 13, 2016 10:51 PM

Free Electrons

Yocto project and OpenEmbedded training updated to Krogoth

Continuing our efforts to keep our training materials up-to-date we just refreshed our Yocto project and OpenEmbedded training course to the latest Yocto project release, Krogoth (2.1.1). In addition to adapting our training labs to the Krogoth release, we improved our training materials to cover more aspects and new features.

The most important changes are:

  • New chapter about devtool, the new utility from the Yocto project to improve the developers’ workflow to integrate a package into the build system or to make patches to existing packages.
  • Improve the distro layers slides to add configuration samples and give advice on how to use these layers.
  • Add a part about quilt to easily patch already supported packages.
  • Explain in depth how file inclusions are handled by BitBake.
  • Improve the description about tasks by adding slides on how to write them in Python.

The updated training materials are available on our training page: agenda (PDF), slides (PDF) and labs (PDF).

Join our Yocto specialist Alexandre Belloni for the first public session of this improved training in Lyon (France) on October 19-21. We are also available to deliver this training worldwide at your site, contact us!

by Antoine Ténart at September 13, 2016 12:24 PM

September 12, 2016

Free Electrons

Free Electrons at the Kernel Recipes conference

The 2016 edition of the Kernel Recipes conference will take place from September 28th to 30th in Paris. With talks from kernel developers Jonathan Corbet, Greg Kroah-Hartman, Daniel Vetter, Laurent Pinchart, Tejun Heo, Steven Rostedt, Kevin Hilman, Hans Verkuil and many others, the schedule looks very appealing, and indeed the event is now full.

Thomas Petazzoni, Free Electrons CTO, will be attending this event. If you’re interested in discussing business or career opportunities with Free Electrons, this event will be a great place to meet together.

by Thomas Petazzoni at September 12, 2016 12:04 PM

September 09, 2016

Elphel

A web interface for a simpler and more flexible Linux kernel dynamic debug controlling

Along with the documentation, there are a number of articles explaining the dynamic debug (dyndbg) feature of the Linux kernel, like this one or this. However, we haven’t found anything that would extend the basic functionality, so we created a web interface using JavaScript and PHP on top of dyndbg.

debugfs-webgui

Fig.1 debugfs-webgui

In most cases it all works fine – when writing a linux driver you:
1. insert pr_debug()/dev_dbg() for debug messaging.
2. compile kernel with dyndbg enabled (CONFIG_DYNAMIC_DEBUG=y)
3. then just ‘echo‘ query strings or ‘cat‘ files with commands to switch on/off the debug messages at runtime. Examples:

  • single:

echo -c 'file svcsock.c line 1603 +pfmt' > /dynamic_debug/control

  • batch file:

cat query-batch-file > /dynamic_debug/control

When it’s all small, enabling/disabling the whole file or a function is not a problem. When the driver grows big with lots of debug messages, or when a few drivers interact with each other, it becomes more convenient to have multiple configurations with certain debug lines on or off. As the source code changes the lines get shifted, and so the batch files require editing.

If the target system (embedded or not) has a network connection and a web server (Apache2 + PHP), a quite simple solution is to add a web interface to the dynamic debug. The one we have developed has the following features:

  • allows having multiple configurations for each file
  • displays only files of interest
  • updates debug configuration for modified files where debug lines got shifted
  • keeps/updates the current config (in json format) in tmpfs – saves to disk on button click
  • p, f, l, m, t flags are supported
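The shifted-lines problem mentioned above can be illustrated with a short sketch. It is not the code of the web interface: the function names are hypothetical, and matching debug calls by their exact source text is an assumption chosen for simplicity. The generated strings follow the `file ... line ... +p` query form shown earlier.

```python
import re

# Lines that contain a dynamic debug call site.
DEBUG_CALL = re.compile(r"\b(?:pr_debug|dev_dbg)\s*\(")


def debug_lines(source_text):
    """Return {line number: stripped source line} for every debug call."""
    return {number: line.strip()
            for number, line in enumerate(source_text.splitlines(), start=1)
            if DEBUG_CALL.search(line)}


def remap_enabled(old_source, new_source, enabled_old_lines):
    """Map enabled line numbers from the old source to the modified one.

    Debug calls are matched by their text; calls that no longer exist
    in the new source are dropped from the configuration.
    """
    old = debug_lines(old_source)
    new_by_text = {}
    for number, text in debug_lines(new_source).items():
        new_by_text.setdefault(text, []).append(number)
    remapped = []
    for number in enabled_old_lines:
        candidates = new_by_text.get(old.get(number), [])
        if candidates:
            remapped.append(candidates.pop(0))
    return remapped


def queries(filename, lines, flags="+p"):
    """Build one dyndbg query string per enabled line."""
    return ["file %s line %d %s" % (filename, number, flags) for number in lines]
```

If a comment line is added at the top of a file whose debug calls on lines 2 and 4 were enabled, `remap_enabled` returns `[3, 5]`, and `queries("drv.c", [3, 5])` produces the two strings to write to the control file (`drv.c` being a made-up file name).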

Get the source code then proceed with the README.md.

by Oleg Dzhimiev at September 09, 2016 12:40 AM

September 01, 2016

Free Electrons

Free Electrons at the X Developer Conference

The next X.org Developer Conference will take place on September 21 to September 23 in Helsinki, Finland. This is a major event for Linux developers working in the graphics/display areas, not only at the X.org level, but also at the kernel level, in Mesa, and other related projects.

Free Electrons engineer Maxime Ripard will be attending this conference, with 80+ other engineers from Intel, Google, NVidia, Texas Instruments, AMD, RedHat, etc.

Maxime is the author of the DRM/KMS driver in the upstream Linux kernel for the Allwinner SoCs, which provides display support for numerous Allwinner platforms, especially Nextthing’s CHIP (with parallel LCD support, HDMI support, VGA support and composite video support). Maxime has also worked on making the 3D acceleration work on this platform with a mainline kernel, by adapting the Mali kernel driver. Most recently, Maxime has been involved in Video4Linux development, writing a driver for the camera interface of Allwinner SoCs, and supervising Florent Revest’s work on the Allwinner VPU that we published a few days ago.

by Thomas Petazzoni at September 01, 2016 02:58 PM

August 31, 2016

Free Electrons

Free Electrons mentioned in Linux Foundation’s report

Last week, the Linux Foundation announced the publication of the 2016 edition of its usual report “Linux Kernel Development – How Fast It is Going, Who is Doing It, What They are Doing, and Who is Sponsoring It”.

This report gives a nice overview of the evolution of the Linux kernel since 3.18, especially from a contribution point of view: the rate of changes, who is contributing, are there new developers joining, etc.

Free Electrons is mentioned in several places in this report. First of all, even though Free Electrons is a consulting company, it is shown individually rather than part of the general “consultants” category. As the report explains:

The category “consultants” represents developers who contribute to the kernel as a work-for-hire effort from different companies. Some consultant companies, such as Free Electrons and Pengutronix, are shown individually as their contributions are a significant number.

Thanks to being mentioned separately from the “consultants” category, the report also shows that:

  • Free Electrons is the #15 contributing company over the 3.19 to 4.7 development period, in number of commits. Free Electrons contributed a total of 1453 commits, corresponding to 1.3% of the total commits
  • Free Electrons is ranked #13 in the list of companies by number of Signed-off-by from developers who are not the author of patches. This happens because 6 of our engineers are maintainers or co-maintainers of various areas in the kernel: they merge patches from contributors, sign off on them, and send them to another maintainer (either the arm-soc maintainers or directly Linus Torvalds, depending on the subsystem).

We’re glad to see Free Electrons mentioned in this report, which shows that we are a strong contributor to the official Linux kernel. Thanks to this contribution effort, we have tremendous experience with adding support for new hardware in the kernel, so contact us if you want your hardware supported in the official Linux kernel!

by Thomas Petazzoni at August 31, 2016 09:08 AM

August 30, 2016

Free Electrons

Support for the Allwinner VPU in the mainline Linux kernel

Over the last few years, and most recently with the support for the C.H.I.P platform, Free Electrons has been heavily involved in initiating and improving the support in the mainline Linux kernel for the Allwinner ARM processors. As of today, a large number of hardware features of the Allwinner processors, especially the older ones such as the A10 or the A13 used in the CHIP, are usable with the mainline Linux kernel, including complex functionality such as display support and 3D acceleration. However, one feature that was still lacking is proper support for the Video Processing Unit (VPU), which accelerates the decoding and encoding of popular video formats in hardware.

During the past two months, Florent Revest, a 19-year-old intern at Free Electrons, worked on a mainline solution for this Video Processing Unit. His work followed the reverse engineering effort of the Cedrus project, and this topic was also listed as a High Priority Reverse Engineering Project by the FSF.

The internship resulted in a new sunxi-cedrus driver, a Video4Linux memory-to-memory decoder kernel driver, and a corresponding VA-API backend, which allows numerous userspace applications to use the decoding capabilities. Both projects have been published on GitHub.

Currently, the combination of the kernel driver and VA-API backend supports MPEG2 and MPEG4 decoding only. There is for the moment no support for encoding, and no support for H264, though we believe support for both aspects can be added within the architecture of the existing driver and VA-API backend.

A first RFC patchset of the kernel driver has been sent to the linux-media mailing list, and a complete documentation providing installation information and architecture details has been written on the linux-sunxi’s wiki.

Here is a video of VLC playing a MPEG2 demo video on top of this stack on the Next Thing’s C.H.I.P:

by Thomas Petazzoni at August 30, 2016 02:13 PM

August 18, 2016

Bunnie Studios

Name that Ware August 2016

The Ware for August 2016 is shown below.

Thanks to Adrian Tschira (notafile) for sharing this well-photographed ware! The make and model of this ware is unknown to both of us, so if an unequivocal identification isn’t made over the coming month, I’ll be searching the comments for either the most thoughtful or the most entertaining analysis of the ware.

by bunnie at August 18, 2016 04:48 PM

Winner, Name that Ware July 2016

The Ware for July 2016 was a board from a Connection Machine CM-2 variant; quite likely a CM-200.

It’s an absolutely gorgeous board, and the sort of thing I’d use as a desktop background if I used a desktop background that wasn’t all black. Thanks again to Mark Jessop for contributing the ware. Finally, the prize this month goes to ojn for a fine bit of sleuthing, please email me to claim your prize! I particularly loved this little comment in the analysis:

The board layout technique is different from what I’ve been able to spot from IBM, SGI, DEC. Cray used different backplanes so the connectors at the top also don’t match.

Every designer and design methodology leaves a unique fingerprint on the final product. While I can’t recognize human faces very well, I do perceive stylistic differences in a circuit board. The brain works in funny ways…

by bunnie at August 18, 2016 04:48 PM

August 16, 2016

Harald Welte

(East) European motorbike tour on 20y old BMW F650ST

For many years I've always been wanting to do some motorbike riding across the Alps, but somehow never managed to do so. It seems when in Germany I've always been too busy - contrary to the many motorbike tours around and across Taiwan which I did during my frequent holidays there.

This year I finally took the opportunity to combine visiting some friends in Hungary and Bavaria with a nice tour starting from Berlin over Prague and Brno (CZ), Bratislava (SK) to Tata and Budapest (HU), further along lake Balaton (HU) towards Maribor (SI) and finally across the Grossglockner High Alpine Road (AT) to Salzburg and Bavaria before heading back to Berlin.

/images/f650st-grossglockner-hochalpenstrasse.jpg

It was eight fun (but sometimes long) days riding. For some strange turn of luck, not a single drop of rain was encountered during all that time, traveling across six countries.

The most interesting parts of the tour were:

  • Along the Elbe river from Pirna (DE) to Lovosice (CZ). Beautiful scenery along the river valley, most parts of the road immediately on either side of the river. Quite touristy on the German side, much more pleasant and quiet on the Czech side.
  • From Mosonmagyarovar via Gyor to Tata (all HU). Very little traffic alongside road '1'. Beautiful scenery with lots of agriculture and forests left and right.
  • The Northern coast of Lake Balaton, particularly from Tihany to Keszthely (HU). Way too many tourists and traffic for my taste, but still very impressive to realize how large/long that lake really is.
  • From Maribor to Dravograd (SI) alongside the Drau/Drav river valley.
  • Finally, of course, the Grossglockner High Alpine Road, which reminded me in many ways of the high mountain tours I did in Taiwan. Not a big surprise, given that both lead you up to about 2500 meters above sea level.

Finally, I have to say I've been very happy with the performance of my 1996 model BMW F 650ST bike, which has coincidentally just celebrated its 20th anniversary. I know it's an odd bike design (650cc single-cylinder with two spark plugs, ignition coils and two carburetors) but consider it an acquired taste ;)

I've also published a map with a track log of the trip

In one month from now, I should be reporting from motorbike tours in Taiwan on the equally trusted small Yamaha TW-225 - which of course plays in a totally different league ;)

by Harald Welte at August 16, 2016 02:00 PM

August 03, 2016

Free Electrons

Linux 4.7 statistics: Free Electrons engineer #2 contributor

Yesterday, LWN.net published an article containing statistics for the 4.7 development cycle. This article is available for LWN.net subscribers only during the coming week, and will then be available for everyone, free of charge.

It turns out that Boris Brezillon, Free Electrons engineer, is the second most active contributor to the 4.7 kernel in number of commits! The top three contributors in number of commits are: H Hartley Sweeten (208 commits), Boris Brezillon (132 commits) and Al Viro (127 commits).

LWN.net 4.7 kernel statistics

In addition to being among the most active developers by number of commits, Boris Brezillon is also the #11 most active contributor in terms of changed lines. As we discussed in our previous blog post, most contributions from Boris were targeted at the PWM subsystem on one side (atomic update support) and the NAND subsystem on the other side.

Another Free Electrons engineer shows up in the per-developer statistics: Maxime Ripard is the #17 most active contributor by lines changed. Indeed, Maxime contributed a brand new DRM/KMS driver for the Allwinner display controller.

As a company, Free Electrons i