<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Cloud Atomic Laboratory</title>
    <link>https://www.cloudatomiclab.com/index.xml</link>
    <description>Recent content on Cloud Atomic Laboratory</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-gb</language>
    <copyright>Justin Cormack</copyright>
    <lastBuildDate>Sun, 23 Aug 2020 18:07:00 +0000</lastBuildDate>
    <atom:link href="https://www.cloudatomiclab.com/index.xml" rel="self" type="application/rss+xml" />
    
    <item>
      <title>Wasp Galls and Weird Machines</title>
      <link>https://www.cloudatomiclab.com/gall-wasp-weird-machines/</link>
      <pubDate>Sun, 23 Aug 2020 18:07:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/gall-wasp-weird-machines/</guid>
      <description>&lt;p&gt;&lt;em&gt;I am really behind on my blog posts, but here is a random one that wasn&amp;rsquo;t actually on the backlog.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I have been using the &lt;a href=&#34;https://www.inaturalist.org/pages/seek_app&#34;&gt;Seek&lt;/a&gt; app quite a bit while wandering the fields near where I live during lockdown. This is a project to visually identify all species through an app that is based on a large open data set and models. It does an amazingly good job, only very occasionally making mistakes like identifying a diving duck briefly as a crocodile; more often it gets stuck at the genus level and finds the species a little too hard, which is forgivable. Definitely recommended to make wandering around more interesting: it will link to interesting things about how species arrived, and things about them that you didn&amp;rsquo;t know from Wikipedia as well. And it is fascinating to be able to walk along paths you have been on many times before and still find a dozen new species.&lt;/p&gt;

&lt;p&gt;The more complex a plant is, the more parasites and attackers it has. But two species, the rose and the oak, seem to have the most interesting attackers of all. I saw this mossy ball one day and was surprised when Seek told me it was a wasp, I mean it looks like a plant. So what is it?&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./gall.jpg&#34; width=&#34;30%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;In computer security we use biological metaphors such as &amp;ldquo;antivirus&amp;rdquo; but the complexity of the natural world really shows us what a complex attacker ecosystem looks like. This is a gall, from the gall wasp species Diplolepis rosae. It is traditionally known in the UK as Robin&amp;rsquo;s pincushion, and more formally as the rose bedeguar gall or mossy rose gall. The gall wasp female lays eggs in the leaf bud of the rose, and these eggs, and later the larvae that hatch from them, manipulate the plant into growing the gall around them. This is why it looks like a plant-like structure, as indeed it is, but not a normal one. In particular, the gall provides highly nutritious plant cells for the wasp larvae to eat, with the plant transporting nutrients directly to the gall for them to eat. It grows in weird ways, but using the host plant&amp;rsquo;s genetic material, manipulated by the wasp in ways that are not yet understood.&lt;/p&gt;

&lt;p&gt;This is exactly the mechanism of &lt;a href=&#34;http://langsec.org/papers/Bratus.pdf&#34;&gt;weird machines&lt;/a&gt; in computer security, where &amp;ldquo;the implicit data flow and the subsequent transfer of control were performed by the program’s own code, borrowed by the exploit for its own purposes.&amp;rdquo; The attacker takes gadgets and existing code fragments and applies them in unexpected, unplanned for, weird ways to make the code do things that were not intended by the author, indeed things that are totally outside the designed scope. &amp;ldquo;Borrowed pieces of code could be strung together, the hijacked control flow linking them powered by their own effects with the right crafted data arranged for each piece.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;Gall wasps are widespread, and each species produces a different type of gall, by attacking the plant in a different way. But roses and oaks seem to be the main hosts. Around where I live these particular rose galls are very common, found on a lot of the wild roses. There are also several kinds of oak gall wasp around.&lt;/p&gt;

&lt;p&gt;It turns out that the galls themselves allow complex attacker communities to thrive. Other species of wasp live in the comfortable gall habitat. In general the other species are not parasitic on the gall wasps, as only these have the ability to keep attacking the host rose to keep the flow of nutrients coming. But other wasp species lay eggs in the same place a little later to also live in the same habitat, and indeed can only live in these places, a lifecycle known as &amp;ldquo;inquiline&amp;rdquo;. There are parasites on the inquilines, and a complex community of attackers; the majority of the wasps that hatch out the next year will not be the original species that caused the gall. The gall is itself easier to attack than the plant, because of how it has been manipulated into a softer mass.&lt;/p&gt;

&lt;p&gt;Another random fact about the Diplolepis rosae wasp is that almost all of them are female. This is actually in itself due to a bacterial infection of the gametes, with the bacteria manipulating the wasp so it only produces female eggs.&lt;/p&gt;

&lt;p&gt;One of the interesting things about computer security is that we are only just starting to see the structure of attacks and defence. The natural world has so many different attack and defence mechanisms that are worth exploring to see what happens when things are subverted in novel ways, or have different types of defence, or little defence at all. Or you can just wander around and learn about the amazing natural world.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Home Work Setup</title>
      <link>https://www.cloudatomiclab.com/home-work/</link>
      <pubDate>Sun, 26 Apr 2020 18:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/home-work/</guid>
      <description>

&lt;p&gt;The other day I got a message from Jenny &amp;ldquo;the Burce&amp;rdquo; saying that I had to get
some equipment to upgrade my live streaming setup for the DockerCon dry run.
Cameras and microphone and things, a list from &lt;a href=&#34;https://www.bretfisher.com/&#34;&gt;Bret
Fisher&lt;/a&gt;. Only problem, I soon discovered, was that
nothing on the list was actually available. Somehow just after lockdown
everything that people might need to live stream audio and video had been panic
bought, along with the flour, toilet paper and eggs. So over the next month or
so I have gradually put together a setup that works, with the aim of improving
the audio and video quality.&lt;/p&gt;

&lt;p&gt;It has also been the first time I have worked at home for long periods;
previously I mostly went to the office with a few meetings at home at the start
and end of the day. Given that we are all going to be homeworking for a long
period, we may as well make it better. When I was buying, all the low end
stuff was unavailable, but I will give some pointers and suggestions as to what
is worthwhile or not, and supply chains should start to improve soon. I am
lucky enough to have a reasonable amount of space; if you are working in a
constrained space I would imagine choices are more limited.&lt;/p&gt;

&lt;p&gt;Also I am lucky enough to be able to work at home, or at all in these difficult
times. Tech workers are so lucky and safe compared to so many others.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./desk.jpeg&#34; width=&#34;60%&#34;/&gt;&lt;/p&gt;

&lt;h2 id=&#34;desk&#34;&gt;Desk&lt;/h2&gt;

&lt;p&gt;1970s Danish teak desk, bought on the Holloway Road some years back. Not in
perfect condition or anything, a desk for using. Hard to move around. Big, not
going to fit in a small space. I don&amp;rsquo;t remember the price, it wasn&amp;rsquo;t a lot and
it will last another 40 years. The lamp is a German, asymmetric one from the
1930s.&lt;/p&gt;

&lt;h2 id=&#34;computer&#34;&gt;Computer&lt;/h2&gt;

&lt;p&gt;MacBook Pro from a few years back. I have been wondering about getting a
desktop as, well, not going anywhere. However I want something silent and that
seems really difficult now. I do have a Linux box (carefully constructed with
large slow fans) and a FreeBSD FreeNAS box under the table, but although they
are fairly quiet I find them too noisy when working so I mostly keep them
switched off. The cloud is silent of course, a great advantage. I may go down
the silent PC building route again soon, will keep you posted.&lt;/p&gt;

&lt;h2 id=&#34;monitor&#34;&gt;Monitor&lt;/h2&gt;

&lt;p&gt;Dell 27 inch 4k monitor I took from the office. I thought it was too small on
this desk, but then realised I had it too far back. I would probably get a USB
C monitor now, just to get more ports nearby but this is fine. I don&amp;rsquo;t like
double monitors due to the gap, I prefer a single large one.&lt;/p&gt;

&lt;h2 id=&#34;internet-connection&#34;&gt;Internet connection&lt;/h2&gt;

&lt;p&gt;It is going to be difficult to improve quality for live conversations without a
good internet connection. Obviously there may not be much choice where you are
though, so changing this can be difficult. I use &lt;a href=&#34;https://www.aa.net.uk/&#34;&gt;Andrews and
Arnold&lt;/a&gt; with 80/20Mb/s VDSL; they are a high quality
service with static IPs, IPv6, and they do not have oversubscription. It costs
a bit more than other providers.&lt;/p&gt;

&lt;h2 id=&#34;keyboard-and-trackpad&#34;&gt;Keyboard and trackpad&lt;/h2&gt;

&lt;p&gt;Need cleaning. Apple bluetooth ones. I also have a (noisy) Hacker&amp;rsquo;s keyboard
around. I much prefer trackpads to mice or trackballs now.&lt;/p&gt;

&lt;h2 id=&#34;dock&#34;&gt;Dock&lt;/h2&gt;

&lt;p&gt;Realising I was about to plug in more things than the computer has ports, I got
the &lt;a href=&#34;https://www.caldigit.com/ts3-plus/&#34;&gt;CalDigit TS3 Plus&lt;/a&gt; as recommended by
someone on Twitter. This provides power down one Thunderbolt cable to the
computer, while having everything else plug into it. It has DisplayPort for the
monitor, and wired ethernet, meaning I can avoid wifi issues. The wired
ethernet goes via ethernet over mains adaptors downstairs to the router. Note
that if you have the new MacBook Pro 16 inch, this consumes a peak 97W of power
which is more than this dock delivers, although maybe there will be a firmware fix.
CPU peak power consumption is getting ridiculous now, 100W laptops!&lt;/p&gt;

&lt;h2 id=&#34;webcam&#34;&gt;Webcam&lt;/h2&gt;

&lt;p&gt;I managed to order a &lt;a href=&#34;https://www.logitech.com/en-us/product/streamcam&#34;&gt;Logitech
StreamCam&lt;/a&gt; direct from
Logitech just before all webcams sold out. It is excellent quality, see
pictures below. I sit it on top of the monitor, and it has USB C. It has a very
wide angle of view, but I eventually found out that the Logitech Camera
Settings App allows you to modify this, with a narrower setting too. This is
just a crop, so it is not as high quality. The Logitech software is much worse
on Mac than Windows it seems, with far less control available; some of the
Windows controls appear to be done in software with a software video out that
other applications can connect to which is not available on Mac. The Logitech
4k cameras apparently have three zoom options as well as ability to set frame
rates, and it looks like some stock may become available again, so these could
be better for a cropped view. Actually using the 4k option is not really
possible with most software at present though, and it requires lots of CPU to
encode.&lt;/p&gt;

&lt;p&gt;Having the camera above you on the monitor is way better than using the camera
on a laptop, which is generally low down unless you raise it up a lot; also as
you want to use a monitor generally the laptop is probably to the side, which
looks strange on calls. I don&amp;rsquo;t know why Apple do not improve the quality of
laptop cameras to match their phone cameras, and I have heard of people using
phones to stream.&lt;/p&gt;

&lt;p&gt;Another option a friend is exploring is using a digital camera; most recent
cameras can stream video although generally only via HDMI out so you need
something like the &lt;a href=&#34;https://www.elgato.com/en/gaming/cam-link-4k&#34;&gt;Elgato Cam
Link&lt;/a&gt; and these are also hard to
get now. With a choice of lenses and zoom and excellent picture quality this is
an option if you already have a suitable camera; you probably want to use a
lens around 35mm it seems. You will need to mount it behind the monitor which
needs some work. Obviously this is a substantially more expensive option and
only makes sense if you have a camera already for other uses.&lt;/p&gt;

&lt;h2 id=&#34;lighting&#34;&gt;Lighting&lt;/h2&gt;

&lt;p&gt;Cameras are way better quality with lights. You might not immediately notice,
so here are some crops to give you an idea of low light versus a reasonable
light. I have the &lt;a href=&#34;https://www.elgato.com/en/gaming/key-light&#34;&gt;Elgato Key
Light&lt;/a&gt;, which is wifi controlled.
You probably need something this bright, I had a small LED panel and it was not
bright enough.&lt;/p&gt;

&lt;p&gt;The pictures below show crops of the video in the dark without lighting, with
light from the window only and lit with additional lighting.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./dark.png&#34; width=&#34;30%&#34;/&gt;&lt;img src=&#34;./shadow.png&#34; width=&#34;30%&#34;/&gt;&lt;img src=&#34;./lit.png&#34; width=&#34;30%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;The Key Light has a slightly annoying property of occasionally losing wifi
access and needing to be reset, although it stays on during this time, so I am
not sure I can entirely recommend it, although it hasn&amp;rsquo;t happened for a while
now. It is also expensive, but generally good. Lights are difficult to
buy. This clamps to the table which is good, as tripod type stands take up
loads of desk space or floor space around.&lt;/p&gt;

&lt;p&gt;I also have a window to the side, which provides most of the light during the
day, but I use the light at a lower level as a fill light, or else the side of
my face away from the window is very dark. At night I use the light as a key
light, and don&amp;rsquo;t use a fill, so it is a bit like Rembrandt lighting. Look at
&lt;a href=&#34;https://en.wikipedia.org/wiki/Three-point_lighting&#34;&gt;three-point lighting&lt;/a&gt; to
get an idea of how to place lights, you ideally want them diagonally
not directly in front, or else it looks very flat. I place the webcam a little
bit asymmetrically pointing into the room so it does not catch the very bright
window. The worst setup is if you have a window behind you, when the camera
will have a hard time, as you can see when having calls with people with that
setup.&lt;/p&gt;

&lt;h2 id=&#34;audio&#34;&gt;Audio&lt;/h2&gt;

&lt;p&gt;Audio gets complicated very fast. Your options are to use your laptop, or to
use the microphone on your webcam, which is what I was doing for a while, and
still do sometimes. There is another problem though about how to listen to the
audio, and avoiding the microphone picking up the sound of the other party, or
yourself. I had a bias towards audio/music equipment as I have used it in the
past a little and it is currently relatively easily available; there are very
different routes you could take here.&lt;/p&gt;

&lt;p&gt;The original recommendation from Bret was to get the &lt;a href=&#34;http://www.samsontech.com/samson/products/microphones/usb-microphones/q2u/&#34;&gt;Samson
Q2U&lt;/a&gt;,
but this remains totally unobtainable. Actually all USB microphones were
unobtainable. If you get a USB dynamic microphone, such as the Q2U or the
&lt;a href=&#34;https://eu.audio-technica.com/ATR2100-USB&#34;&gt;Audio Technica ATR2100&lt;/a&gt; which is
similar but more expensive (but maybe available now) then your route will be
simpler and cheaper than mine below.&lt;/p&gt;

&lt;p&gt;So I went the traditional route. Generally the advice seemed to be that unless
your room is a soundproofed studio, get a dynamic microphone not a condenser
microphone, as they are more directional and likely to mostly pick up your
voice not what is going on outside or downstairs or even the noise from your
keyboard. I went for the classic &lt;a href=&#34;https://www.shure.com/en-US/products/microphones/sm57&#34;&gt;Shure
SM57&lt;/a&gt;, a microphone that
has been around so long it &lt;a href=&#34;https://en.wikipedia.org/wiki/Shure_SM57&#34;&gt;has its own Wikipedia
page&lt;/a&gt; and &lt;a href=&#34;https://www.theverge.com/2017/1/25/14384774/trump-microphone-speech-long-neck-shure-sm57&#34;&gt;White House
stories&lt;/a&gt;. I ordered direct from the manufacturer which was very
quick; apparently there are a lot of fakes of these so it is worth buying from
a reputable place. You can&amp;rsquo;t see it clearly in the photo above as it is pointing
straight at me. When I am sitting it does not obstruct the view, but I can move
it away and back on the mic stand, see below.&lt;/p&gt;

&lt;p&gt;As the mic has an XLR analogue output you need a way to plug it into the
computer. The easiest way is to get an audio interface, which combines a microphone pre-amp
and an analogue to digital converter. I got the &lt;a href=&#34;https://evo.audio/products/evo-4/overview/&#34;&gt;Audient
EVO4&lt;/a&gt;, which seems really nice and
excellent quality. Audient is a UK company that makes mixers and other
professional audio recording hardware; this is their &amp;ldquo;diffusion line&amp;rdquo; but has
the same high quality hardware. This also acts as a headphone amp, and can live
mix the audio from the mic into the headphone so you can listen to yourself
speaking. It supports two mics, or a mic and an instrument, and there is also a
four channel version, for a future world without social distancing when we are
in the same room again. There is only one potential issue with this
combination, which is that the microphone outputs at a very low level. The EVO4
has 58dB gain, which is quite a bit more than most units I looked at, but if
you have a quiet normal speaking voice and don&amp;rsquo;t project it, then even with
the gain set to maximum it is a little quieter than ideal if you speak more
than around two inches away from the mic. It is fine at around two inches away,
although with some extra bass emphasis, or if you speak up a bit, but I am not
really used to doing either of those most of the time on calls. I should
probably get used to it; the
&lt;a href=&#34;https://pubs.shure.com/guide/SM57/en-US&#34;&gt;recommendation&lt;/a&gt; is to be less than
15cm away.&lt;/p&gt;

&lt;p&gt;I ended up, in the spirit of testing every option, getting a
&lt;a href=&#34;https://www.tritonaudio.com/fethead&#34;&gt;FetHead&lt;/a&gt; which is a tiny microphone
preamp that fits inline with the mic and provides an additional 27dB of gain,
powered from the preamp. This is designed for exactly this use case with
dynamic microphones. Adding it suddenly shifted from having to use max gain at
all times to being in the middle of the scale and having plenty of room to
adjust. It also cut the small low noise level even lower. I would say if your
preamp has less than 58dB of gain you would need this with this mic; otherwise
you could get away without it, but it gives a little more flexibility. I chose
the EVO4 partly because it has relatively high gain; with the FetHead you get
more choice, as any audio interface will be fine, although the
EVO4 is still a nice choice I think.&lt;/p&gt;

&lt;p&gt;Usually you are recommended to use headphones for audio recording, so as not to
record the output sounds along with input. Much software has echo cancellation
built in, and the Mac has some hardware cancellation, although that may just be
on the built in microphone and speakers. This means that you don&amp;rsquo;t necessarily
need to wear headphones for many use cases, although they will give you a
better idea of relative volume levels if you have multiple sources, and
depending on your exact setup and mic they will reduce echo or noise. Your
voice will sound a little different in the headphones than you are used to, but
there is no lag, and you get used to it. Having your own voice mixed into your
headphones stops you shouting, which people tend to do with headphones as they
cannot hear themselves and compensate. A dynamic mic like the Shure is also fine for
recording with speakers even without cancellation, that is a normal stage
recording setup that they are often used for, ideally with the speakers at 65
degrees behind the mic as that is the zone of least sensitivity. I may well set
up some speakers later; the EVO4 has line out for speakers too. It is less
clear where to put the speakers on the desk though.&lt;/p&gt;

&lt;p&gt;You really want a mic &amp;ldquo;boom stand&amp;rdquo; with this setup so you can move the mic out
of the way, and then place it back in the right place, as mic placement is
important. I had no idea about stands and got the &lt;a href=&#34;https://neewer.com/products/microphones-accessories-90087662&#34;&gt;Neewer
NB-35&lt;/a&gt; which is
very cheap, and it does the job but it is a bit annoying as the part that holds
the mic is hard to keep at the right angle, and the whole thing moves in a
slightly annoying way. I may try a different one.&lt;/p&gt;

&lt;p&gt;I originally got the &lt;a href=&#34;https://www.audio-technica.com/cms/headphones/f6e3988012a67cd1/index.html&#34;&gt;Audio Technica
M30x&lt;/a&gt;
 headphones. These are not too expensive, and good quality closed ear
headphones, which block out external noise well. I did find that wearing them
for long periods made my ears hot and slightly squashed and they are not great
after an hour or so. I ended up getting open backed, around the ear headphones,
&lt;a href=&#34;https://en-us.sennheiser.com/best-audio-headphones-high-end-stereo-hifi-hd-600&#34;&gt;Sennheiser
HD600&lt;/a&gt; which are way more comfortable to wear for long periods, and sound
great. As they aren&amp;rsquo;t closed, other people can hear what you are listening to, so you wouldn&amp;rsquo;t wear
them travelling or in a shared office, but if you have your own room to work in
this design works really well, if you don&amp;rsquo;t want total sound isolation and
noise cancellation (you can hear the doorbell ring, which is useful). You also
can hear yourself speak, although I do like a little microphone mixed in; you
could use these with any kind of microphone without a mixer, and some come with
built in mics. I tested recording while having music playing in the headphones,
and with the Shure mic the recording level even with quite loud music is
negligible with your head in the normal direction; if you point your ears at
the mic it clearly picks up the sound. With a less directional mic such as the
one in the webcam it picks up quite a bit of the noise though.&lt;/p&gt;

&lt;p&gt;Overall I would say that with a dynamic microphone you get a lot more
flexibility in your headphone options. For recording something offline I would
probably use the closed ear headphones or not listen at all during the
recording (the EVO4 can show mic line level). For talking to other people and
daily use the open back headphones are so much more comfortable that they make
a lot of sense, and you can just switch from listening to music to making calls.&lt;/p&gt;

&lt;p&gt;I didn&amp;rsquo;t make any effort to choose portable equipment, as this is lockdown, but
other than the mic stand it is all relatively portable equipment. The EVO4 can
be plugged into an iPad with USB C, or an iPhone if you have the &lt;a href=&#34;https://www.apple.com/uk/shop/product/MK0W2ZM/A/lightning-to-usb-3-camera-adapter?fnode=97&amp;amp;fs=f%3Dadapter-apple-cable%26fh%3D458e%252B45c4%252B3214%252B45b0&#34;&gt;Lightning to
USB3 Camera
Adapter&lt;/a&gt;
which despite its name is a generic USB3 adapter that accepts input
power over another lightning port to power external devices that need
additional power that the phone won&amp;rsquo;t provide. I tested recording and playback
on my phone with this adapter and it worked fine.&lt;/p&gt;

&lt;p&gt;The best place I have found for buying audio equipment, other than ordering
direct from the manufacturer, is
&lt;a href=&#34;https://www.thomann.de/gb/index.html&#34;&gt;Thomann&lt;/a&gt;. They are a German family firm
but with a global online shop, and deliver fast and efficiently to the UK, and
their prices are a lot lower than Amazon.&lt;/p&gt;

&lt;h2 id=&#34;comparing-the-options&#34;&gt;Comparing the options&lt;/h2&gt;

&lt;p&gt;Below is a video of using internal camera and webcam, and internal mic, webcam
mic, AirPods and the Shure mic. I used the Zoom cloud recording, so this gives
an idea of what someone would see and hear at the other end of a call with me,
rather than the best quality for local recording. Note that I had the window
open and a motorbike goes past a couple of times, though sadly not while I was
using every microphone; I did type on the keyboard, so you can hear what some
non directional noise pickup is like. Overall the audio quality and resistance
to noise pickup for the Shure SM57 is substantially better than any of the
other options. So be nice to your co-workers and improve your audio.&lt;/p&gt;

&lt;video width=&#34;640&#34; height=&#34;480&#34; controls poster=&#34;/desk.jpeg&#34;&gt;
  &lt;source src=&#34;./GMT20200426-154851_Justin-Cor_640x360.mp4&#34; type=&#34;video/mp4&#34;&gt;
Your browser does not support the video tag.
&lt;/video&gt; 

&lt;h2 id=&#34;linux&#34;&gt;Linux&lt;/h2&gt;

&lt;p&gt;I haven&amp;rsquo;t yet tested any of this equipment on Linux. I use my Linux machines as
servers not desktop machines at present. The EVO4 audio is a standard USB audio
device so should just work, and I think the Logitech cameras in base settings
are standard too, but there may well be no control of settings, probably including
crop, as this is maybe not standard; I am not entirely sure. Probably best to check.&lt;/p&gt;

&lt;h2 id=&#34;is-it-worth-it&#34;&gt;Is it worth it?&lt;/h2&gt;

&lt;p&gt;Well, it is not necessary. As I spend a lot of time on calls and do quite a few
conference talks that will all be online for at least the next year or so, I think
improving the quality is worth it. The differences are noticeable as you can see
from the recordings. Audio quality makes a lot of difference to meetings, and I
would make that a priority if you want to work on something. Supply chains should
get better over the next few months so it should get easier to find more choices.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>WTFosdem</title>
      <link>https://www.cloudatomiclab.com/wtfosdem/</link>
      <pubDate>Sun, 26 Jan 2020 13:30:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/wtfosdem/</guid>
      <description>&lt;p&gt;Next weekend is &lt;a href=&#34;https://fosdem.org&#34;&gt;Fosdem&lt;/a&gt;, the largest open source event in
Europe. A lot of people will no doubt be coming for the first time, or thinking
about coming another year, so I thought it might be helpful to explain what it
is. Fosdem is not really like any other event, so Americans in particular find
it confusing, thinking it might be like OSCON or something. It is not. Of US
events I know, it is perhaps most like All Things Open, but it really is a
different thing. My qualifications for writing this are that I have been going on and
off since about 2004. I worked there a few years, back when Greenpeace ran the
conference WiFi, before Cisco took over, and I have spoken once.&lt;/p&gt;

&lt;p&gt;The first practicality is you notice you don&amp;rsquo;t have to register, or indeed pay.
You should however donate (on site) if you can afford it, although they will
try to give you a really ugly t-shirt if you do. Most people do not donate, so
the conference relies on volunteers, the Université libre de Bruxelles which
gives the space, and, increasingly, corporate sponsors. The next practicality
is where to stay. The location is not very central, and while there is a tram
link it can get extraordinarily full. The best plan is to either stay within
walking distance, or to stay near the start of the tram line, which is near St
Catherine in the centre of Brussels. You can also use taxi/Uber but the sheer
number of people trying to get to and from the location can mean delays.
Brussels is one of my favourite cities in Europe, and along with my friends who
live there, one of the reasons I usually decide to attend. I highly recommend
you spend some time visiting the city. It is February though, so bring hat,
gloves and warm clothes. Some years it has been snowy and the hills get
slippery so be careful walking around, and allow extra time.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./brusselssnow.jpg&#34; alt=&#34;Brussels snow&#34; style=&#34;width:80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;The next practicality is that this conference is overwhelmingly attended by
white men. Most tracks will not have any women speakers. We know tech has a
diversity problem, but it is really in your face here more than other places.
Since 2016 there has at least been a &lt;a href=&#34;https://fosdem.org/2020/practical/conduct/&#34;&gt;code of
conduct&lt;/a&gt; after &lt;a href=&#34;http://www.sarahmei.com/blog/2015/02/01/the-fosdem-conundrum/&#34;&gt;Sarah Mei wrote
about it in
2015&lt;/a&gt;. Richard
Stallman attended as recently as 2016. Sarah&amp;rsquo;s piece says it &amp;ldquo;feels like 2007&amp;rdquo;,
and this is changing very slowly.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./fosdem.jpg&#34; alt=&#34;Fosdem attendees&#34; style=&#34;width:80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;Fosdem started as a developer meetup place, where distributed communities would
meet to hack on things, and talk about what they have done. So everything is
divided by project-like grouping. There are a large number of rooms for talks,
but not enough for all the diversity of modern open source, so some years
projects like Perl that always used to have Fosdem community meetings don&amp;rsquo;t get
a room, and things get grouped where they used to be split, like &amp;ldquo;small
languages&amp;rdquo; or &amp;ldquo;desktop&amp;rdquo;. From an audience point of view that&amp;rsquo;s better, and the
community meetings do tend to happen, in the hacking rooms, over meals and so
on. The traditional thing to do is sit in one room all day, but of course lots
of people are interested in learning about new things and want to wander
around. And some things are massively popular and in smallish rooms (most rooms
are smallish), such as the Go room in recent years.&lt;/p&gt;

&lt;p&gt;&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;So did fosdem decide
to give away deodorant yet, or is it still the same ol thing&lt;/p&gt;&amp;mdash; Jessie
Frazelle (@jessfraz) &lt;a
href=&#34;https://twitter.com/jessfraz/status/960177403904647168?ref_src=twsrc%5Etfw&#34;&gt;February 4, 2018&lt;/a&gt;&lt;/blockquote&gt; &lt;script async
src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;So the talk you want to go to might well be full. Full means full, if the sign
is on the door it means you won&amp;rsquo;t get in. Remember all the talks are recorded
and streamed, and the AV team is pretty amazing. Years ago only some of the rooms
were recorded, but now you won&amp;rsquo;t miss them. So have a backup plan. I remember a
particularly enjoyable &amp;ldquo;we can&amp;rsquo;t get into the Go devroom&amp;rdquo; meeting with Jaana and
others one year. Overall my strategy is generally to go to a few things at
random that might be interesting, maybe target a few specific ones that I
really want to go to (and go early maybe for the previous talk) but not regret
if I can&amp;rsquo;t get in, and spend most of the time talking to people. The random
things can be great, that is how I started working on NetBSD and rump kernels,
after going to a talk pretty much because I thought a talk about testing
kernels might be interesting. You never know what paths you might go down in
future.&lt;/p&gt;

&lt;p&gt;Note there is a growing &lt;a href=&#34;https://fosdem.org/2020/fringe/&#34;&gt;Fringe&lt;/a&gt; of events
around Fosdem, before, during and after. No doubt, as with the Edinburgh
Festival, the Fringe will soon dwarf the original event.&lt;/p&gt;

&lt;p&gt;The whole event is really hectic, and there are going to be maybe 6,000 people
there, maybe more. This gets overwhelming, so take time out for yourself.
I am only planning to attend on Saturday this year, and just to chill out in
Brussels on Sunday.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./bigtech.jpg&#34; alt=&#34;Posters from Fosdem&#34; style=&#34;width:80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;Fosdem has a strong culture of open source as freedom and as a political
statement, and there is widespread antipathy to corporate open source. For a
long time there was no real sign of the larger tech companies, but this has
changed in recent years, with Google, AWS and the CNCF sponsoring this year,
and a visible presence of more corporate and industry rather than
grassroots open source. You will meet people who don&amp;rsquo;t like this, don&amp;rsquo;t like
permissive licenses, and might object to your company&amp;rsquo;s open source policies.
In many ways this feels kind of refreshing.&lt;/p&gt;

&lt;p&gt;Food is very important. Talks run all day, so you need to plan some time for
lunch. The quickest thing is the baguettes that are available at various
places, e.g. downstairs at the back of Janson. They are very efficient about dispensing
these fast. There isn&amp;rsquo;t much choice. There are food trucks out front, with huge
queues at lunchtime. I usually go to Le Pain Quotidien (eat in or
take away) in the small cluster of shops down the road. That is busy but less
so. There is really not much else around this area.&lt;/p&gt;

&lt;p&gt;Coffee is important too. There is a GitHub sponsored coffee stall that is good,
but it is free so the queue tends to be very long. The next best coffee is at
the cafeteria. Le Pain Quotidien does coffee too. If you want tea, on Saturday
this year OpenUK are serving tea and biscuits and Brexit commiseration on their
stand.&lt;/p&gt;

&lt;p&gt;Beer is a fixture at Fosdem. Belgium makes some of the finest beers in the
world, and some ok ones too. Beers are sold at several points in the venue, and
it is common to take them to talks and so on. Beware: most Belgian beers are
strong. Also the kriek they sell at the venue is terrible, even though Belgium
makes some amazing examples of this beer style. There is a pre-conference &amp;ldquo;beer
event&amp;rdquo; on Friday. I haven&amp;rsquo;t even tried to go for many years: even though they
take over an entire street, it is too crowded to be enjoyable or to find anyone you
want to talk to. Yes, there are a lot of alcohol focused events, and events in
bars, which could be offputting if you don&amp;rsquo;t drink.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./etoilevert.jpg&#34; alt=&#34;Antique shop&#34; style=&#34;width:80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;Brussels is a lovely city. The architecture is beautiful, both the old as
exemplified by Grand Place, which is magical in the evening, and the art nouveau
gems, such as the Musical Instrument Museum, once a shop, and the diversity
everywhere. It is said that there is a rule about not copying buildings,
although I am not sure this is really the cause, but Belgium does not have
terraces of identical houses; instead every building is totally different. The
Belgians are also as eccentric as the British, if not more so. Also don&amp;rsquo;t miss
the Galeries Royales Saint-Hubert, the first glazed shopping street in Europe, from
1847.&lt;/p&gt;

&lt;p&gt;Perhaps my favourite area is the part between Sablon, which has a grand
antique market and excellent chocolate shops, and the Marché aux Puces, the
flea market which is full of junk. In between are several streets largely filled
with antique shops, selling midcentury furniture, and, well, everything. Some are
huge inside, full of things and stuff of every kind just jumbled up anyhow.
There are often amazing window displays like the one below.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./scissors.jpg&#34; alt=&#34;window display&#34; style=&#34;width:50%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;Food in Brussels is really good, although Fosdem is not always the best time to
eat as you are often with an indeterminate number of people and getting
reservations, which are often needed on Friday and Saturday nights, is hard.
Also most places are small. Brussels is very international, and all kinds of
food are available there. Most people just think that Belgian food is
frites with mayonnaise and waffles, but there is traditional French and Flemish
food, and great seafood, not just mussels. The local beer is
lambic, the sourdough of beer styles, made with wild yeast and only
made in the region. Cantillon is one of the best, and has an amazing museum in
the working brewery in Brussels. This style of beer is sour, but it is
absolutely delicious. If you love this style, Moeder Lambic is a great place to
try it. A number of new breweries have opened recently; de la Senne is
excellent and available in good bars.&lt;/p&gt;

&lt;p&gt;This year, the Friday night before Fosdem is Brexit. Brussels has a large UK
community, and Fosdem always has a large UK contingent, with whole Eurostar
trains being filled on Friday evening usually. So be nice to any of us you see.&lt;/p&gt;

&lt;p&gt;So, yeah, that is Fosdem. Unique. Could be better. Enjoy Brussels.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./market.jpg&#34; alt=&#34;Brussels market&#34; style=&#34;width:80%&#34;/&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Linearity among the toctou</title>
      <link>https://www.cloudatomiclab.com/toctou/</link>
      <pubDate>Fri, 01 Nov 2019 17:27:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/toctou/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;./lineforawalk.png&#34; width=&#34;40%&#34;&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Illustration from Paul Klee, Pedagogical Sketchbook, 1925&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I have been reading a lot of papers on linear types recently. Originally it was
to understand better why Rust went down the path it did, but I found a lot more
interesting stuff there. While some people now are familiar with linear types as
the basis for Rust&amp;rsquo;s memory management, they have been around for a long time
and have lots of other potential uses. In particular they are interesting for
improving resource allocation in functional programming languages by reusing
storage in place where possible. Generally they are useful for reasoning about
resource allocation. While the Rust implementation is probably the most widely
used at present, it kind of obscures the underlying simple principles by adding
borrowing, so I will only mention it a little in this post.&lt;/p&gt;

&lt;p&gt;So what are linear types? I recommend you read &lt;a href=&#34;https://dl.acm.org/citation.cfm?doid=199818.199860&#34;&gt;“Use-once” variables and linear
objects: storage management, reflection and
multi-threading&lt;/a&gt; by Henry
Baker, as it is the best general overview I have found. The basic idea is
extremely simple: linear variables can only be used once, so any function that
receives one must either return it, or pass it to another function that
consumes it. Using values only once sounds kind of weird and restrictive, but
there are some ways it can be made easier. Some linear types may have an
explicit copy operation to duplicate them, and others may have operations that
return a new value, in a sequential way. For example a file object might have a
read operation that returns the portion read and a new linear object to read
for the next part, preserving a functional model: side effects are fine if you
cannot reuse a variable. You won&amp;rsquo;t really recognise much of the Rust model
here, as it allows borrows, which presents a much less austere effect. It does
all sound fairly odd until you get used to it, even though it is simpler than
say monads as a way of sequencing. Note also that there are related affine
types, where you can use values zero or one times, so values can be discarded,
and other forms such as uniqueness types, and many other fun variants in the
literature.&lt;/p&gt;
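
&lt;p&gt;To make the file example concrete, here is a minimal sketch in Python. Python cannot check linearity statically, so this version only detects reuse at runtime; the names &lt;code&gt;LinearReader&lt;/code&gt; and &lt;code&gt;LinearError&lt;/code&gt; are made up for illustration and do not come from Baker&amp;rsquo;s paper or any library.&lt;/p&gt;

```python
class LinearError(RuntimeError):
    """Raised when a use-once handle is used a second time."""


class LinearReader:
    """Hypothetical use-once reader over an in-memory byte string."""

    def __init__(self, data: bytes, offset: int = 0):
        self._data = data
        self._offset = offset
        self._consumed = False

    def read(self, n: int):
        """Consume this handle; return (chunk, fresh handle for the rest)."""
        if self._consumed:
            raise LinearError("handle already consumed")
        self._consumed = True
        end = min(self._offset + n, len(self._data))
        chunk = self._data[self._offset:end]
        return chunk, LinearReader(self._data, end)


r0 = LinearReader(b"hello world")
first, r1 = r0.read(5)    # r0 is consumed here
second, r2 = r1.read(6)   # r1 is consumed here

try:
    r0.read(5)            # reusing the old handle is an error
    reused = True
except LinearError:
    reused = False
```

&lt;p&gt;Each &lt;code&gt;read&lt;/code&gt; hands back a fresh handle, so the order of reads is threaded through the values themselves. A language with real linear types would reject the second use of &lt;code&gt;r0&lt;/code&gt; at compile time rather than raising an error at runtime.&lt;/p&gt;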

&lt;p&gt;Memory is probably the easiest way to understand the use cases. Think about
variables as referring to a chunk of memory, rather than being a pointer.
Memory can be copied, but it is an explicit, relatively costly operation (i.e.
&lt;code&gt;memcpy&lt;/code&gt;) on the memory type, so the normal access should be linear with
explicit copying only if needed. Because the value of the memory may be changed
at any time by a write, you need to make sure there are not multiple writers or
readers that are not reading in a deterministic order. Rust does this with
mutable borrows, and C++ has a related thing with move semantics.&lt;/p&gt;

&lt;p&gt;Rust&amp;rsquo;s borrow checker allows either a single reference with read and write
access, or multiple readers when there is no write access. Multiple readers is
of course not a linear access pattern, but is safe as multiple reads of an
immutable object return the same value. The complexity of the borrow checker
comes from the fact that objects can change between these states, which
requires making sure statically that all the borrows have finished. Some of the
use cases for linearity in functional languages relate to this, such as
efficiently initialising an object that will be immutable later, so you want
linear write access in the initialisation phase, followed by a non linear read
phase. There are definitely interesting language tradeoffs in how to expose
these types of properties.&lt;/p&gt;

&lt;p&gt;Anyway, I was thinking about inter process communication (IPC) again recently,
in particular ring buffer communication between processes, and it occurred to me
that this is another area where linearity is a useful tool. One of the problems
with shared memory buffers for communication, where one process has read access
and the other write access for each direction of communication, is that the
writing process may try to attack the reader by continuing to write after
reading has started. The same issue applies for userspace to kernel
communication, where another userspace thread may write to a buffer that the
kernel has already read. The aim is to trigger a time of check to time of use
(toctou) attack, for example where a size is checked to be in range, but
the attacker increases it after the check. The standard defence is to copy buffers
to a private buffer, where validation may happen undisturbed. This of course
has a performance hit, but many IPC implementations, and the Linux kernel, do
this for security reasons.&lt;/p&gt;

&lt;p&gt;Thinking about toctou as a linearity problem, we can see that &amp;ldquo;time of check&amp;rdquo;
and &amp;ldquo;time of use&amp;rdquo; are two different reads, and if we treat the read buffer as a
linear object, and require that its contents are each only read once, then time
of check and time of use cannot be different. Note of course that it does not
matter exactly which version gets read, all that matters is that it is a
consistent one. We have to remember the value of the part we check and keep
that for later if we can&amp;rsquo;t use it immediately. So linear read has its uses. Of
course it is not something that programming languages give us at present,
generally a compiler will assume that it can reload from memory if it needs to.
Which is why copying is used; copying is a simple linear operation that is
available. But there are often cases where the work being done on the buffer
can be done in a linear way without copying, if only we had a way of telling
the compiler or expressing it in the language.&lt;/p&gt;
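
&lt;p&gt;The difference between two reads and one can be shown with a small simulation. This is only a sketch under stated assumptions: &lt;code&gt;SharedBuffer&lt;/code&gt; is a hypothetical stand-in for memory that another process can rewrite, and it simply returns an attacker-chosen length on every load after the first. Real shared memory, and what a compiler may do with reloads, is more subtle than this.&lt;/p&gt;

```python
class SharedBuffer:
    """Hypothetical stand-in for memory another process can rewrite.

    Every load of the length field after the first returns an
    attacker-chosen value, modelling a writer racing with the reader.
    """

    def __init__(self, honest_len: int, evil_len: int):
        self._honest_len = honest_len
        self._evil_len = evil_len
        self.loads = 0

    def load_len(self) -> int:
        self.loads += 1
        return self._honest_len if self.loads == 1 else self._evil_len


MAX_LEN = 16


def double_fetch(buf: SharedBuffer) -> int:
    """Buggy: the check and the use are two separate loads."""
    if buf.load_len() > MAX_LEN:      # time of check
        raise ValueError("length out of range")
    return buf.load_len()             # time of use: may differ


def single_fetch(buf: SharedBuffer) -> int:
    """Linear style: load once, then check and use the same local value."""
    n = buf.load_len()
    if n > MAX_LEN:
        raise ValueError("length out of range")
    return n


unsafe_len = double_fetch(SharedBuffer(honest_len=8, evil_len=4096))
safe_len = single_fetch(SharedBuffer(honest_len=8, evil_len=4096))
```

&lt;p&gt;Here &lt;code&gt;double_fetch&lt;/code&gt; passes the check with 8 and then uses 4096, while &lt;code&gt;single_fetch&lt;/code&gt; checks and uses the same value. It does not matter which version of the length the single read observes, only that the check and the use agree.&lt;/p&gt;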

&lt;p&gt;Overall, I have found the linear types literature helpful in finding ways to
think about resource allocation, and I would recommend exploring in this space.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Fuzz rising</title>
      <link>https://www.cloudatomiclab.com/fuzz/</link>
      <pubDate>Sun, 21 Jul 2019 20:52:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/fuzz/</guid>
      <description>&lt;p&gt;Go and read the excellent &lt;a href=&#34;https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/&#34;&gt;blog post from Cloudflare on their recent
outage&lt;/a&gt; if you haven&amp;rsquo;t already.&lt;/p&gt;

&lt;p&gt;I am not going to talk about most of it, just a few small points that
especially interest me right now, which are definitely not the most important
things from the outage point of view. This post got a bit long so I split it
up, so this is part one.&lt;/p&gt;

&lt;p&gt;Fuzz testing has been around for quite some time. &lt;a href=&#34;http://lcamtuf.coredump.cx/afl/&#34;&gt;American Fuzzy
Lop&lt;/a&gt; was released in 2013, and was the first
fuzzer to need very little configuration to find security issues. &lt;a href=&#34;https://users.ece.cmu.edu/~sangkilc/papers/oakland15-cha.pdf&#34;&gt;This paper
on mutational
fuzzing&lt;/a&gt; is a
starting point if you are interested in the details of how this works. The
basic idea is that you start with a valid input, and gradually mutate it,
looking for &amp;ldquo;interesting&amp;rdquo; changes that change the path the code takes. This is
often coverage guided, so that you attempt to cover all code paths by changing
input data.&lt;/p&gt;
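
&lt;p&gt;As a rough illustration of that loop, here is a toy coverage guided mutational fuzzer in Python. It is a sketch only and is not how American Fuzzy Lop is implemented: the &lt;code&gt;target&lt;/code&gt; function, its fake crash on the prefix &lt;code&gt;FUZ&lt;/code&gt;, and the single byte mutation are all assumptions made for the example, and real fuzzers instrument compiled code rather than having the target report its own path.&lt;/p&gt;

```python
import random


def target(data: bytes) -> frozenset:
    """Toy program under test; returns the set of branches it took.

    Raises RuntimeError on the input prefix b"FUZ", standing in for a crash.
    """
    path = set()
    if len(data) > 0 and data[0] == ord("F"):
        path.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            path.add("U")
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("crash")
    return frozenset(path)


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def fuzz(seed_input: bytes, rounds: int, rng: random.Random):
    """Keep inputs that reach new coverage; collect inputs that crash."""
    corpus = [seed_input]
    seen = {target(seed_input)}
    crashes = []
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            coverage = target(candidate)
        except RuntimeError:
            crashes.append(candidate)
            continue
        if coverage not in seen:      # a new path: keep it for further mutation
            seen.add(coverage)
            corpus.append(candidate)
    return corpus, crashes


corpus, crashes = fuzz(b"AAAA", rounds=20000, rng=random.Random(0))
```

&lt;p&gt;The point of keeping inputs that reach new branches is that the fuzzer can discover the prefix one byte at a time instead of having to guess all three bytes at once. Whether a given run finds the crash depends on the random seed and the number of rounds.&lt;/p&gt;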

&lt;p&gt;Fuzz testing is not the only tool in the space of automated security issue
detection. There is traditional static analysis tooling, although it is
generally not very efficient at finding most security issues, other than a few
things like SQL injection that are often well covered. It tends to have a high
false positive rate, and unlike fuzz testing will not give you a helpful test
case. Of course there are many other things to consider in comprehensive
security testing, &lt;a href=&#34;https://www.bsimm.com/framework/software-security-development-lifecycle/software-security-testing.html&#34;&gt;this list of considerations is very
useful&lt;/a&gt;. Another technique is automated variant
analysis, taking an existing issue and finding other cases of the same issue,
as done by &lt;a href=&#34;https://semmle.com/variant-analysis&#34;&gt;platforms such as Semmle&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Fuzzing as a service is available too. Operationally, fuzzing is not something
you want to run in your CI pipeline, as it is not a test that finishes. It is
something that you should run continuously &lt;sup&gt;24&lt;/sup&gt;&amp;frasl;&lt;sub&gt;7&lt;/sub&gt; on the latest version of your
code, as it still takes a long time to find issues, and is
randomised. Services include &lt;a href=&#34;https://fuzzbuzz.io/&#34;&gt;Fuzzbuzz&lt;/a&gt;, a fairly new
commercial service (with a free tier) who are very friendly, &lt;a href=&#34;https://www.microsoft.com/en-us/security-risk-detection/&#34;&gt;Microsoft
Security Risk
Detection&lt;/a&gt; and
Google&amp;rsquo;s &lt;a href=&#34;https://github.com/google/oss-fuzz/&#34;&gt;OSS-Fuzz&lt;/a&gt; for open source
projects.&lt;/p&gt;

&lt;p&gt;As Cloudflare commented &amp;ldquo;In the last few years we have seen a dramatic increase
in vulnerabilities in common applications. This has happened due to the
increased availability of software testing tools, like fuzzing for example.&amp;rdquo;
Some numbers give an idea of the scale: as of January 2019, Google&amp;rsquo;s
ClusterFuzz has found around 16,000 bugs in Chrome and around 11,000 bugs in
over 160 open source projects integrated with OSS-Fuzz. We can see the knock on
effect on the rate of CVEs being reported.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./number-of-CVEs-per-year.png&#34; width=&#34;80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;If we look at the kinds of issues found, using data from &lt;a href=&#34;https://security.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html&#34;&gt;a 2017 Google blog
post&lt;/a&gt;, the breakdown is interesting.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./fuzzissues.jpg&#34; width=&#34;80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;As you can see a very large proportion are buffer overflows, manual memory
management issues like use after free, and the
&amp;ldquo;&lt;a href=&#34;https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html&#34;&gt;ubsan&lt;/a&gt;&amp;rdquo;
category, which is all the stuff in C or C++ code that if you happen to write
it the compiler can turn your program into hot garbage if it feels like it.
Memory safety is still a major cause of errors, as you can see if you follow
the &lt;a href=&#34;https://twitter.com/LazyFishBarrel&#34;&gt;@LazyFishBarrel&lt;/a&gt; Twitter account. Note
that the majority of projects are still not running comprehensive automated
testing for these issues, and this problem is rapidly increasing. There are
two factors at play: first, memory errors are an easier target than
many other sorts of errors to find with current tooling, and second, there is a
huge codebase that has huge numbers of these errors.&lt;/p&gt;

&lt;p&gt;Microsoft Security Response Center also just &lt;a href=&#34;https://msrc-blog.microsoft.com/2019/07/16/a-proactive-approach-to-more-secure-code/&#34;&gt;released a blog
post&lt;/a&gt; with some more numbers. While ostensibly about Microsoft&amp;rsquo;s
gradually increasing use of Rust, the important quote is that &amp;ldquo;~70% of the
vulnerabilities Microsoft assigns a CVE each year continue to be memory safety
issues&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;In my talk at Kubecon I touch on some of these issues with C (and to some
extent C++) code. The majority of the significant issues found in the CNCF
security audits were in C or C++ code, despite the fact there is not much of
this code in the reviewed projects.&lt;/p&gt;

&lt;iframe width=&#34;560&#34; height=&#34;315&#34;
src=&#34;https://www.youtube.com/embed/0BkKpsrUo5k&#34; frameborder=&#34;0&#34;
allow=&#34;accelerometer; encrypted-media; gyroscope; picture-in-picture&#34;
allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;Most of the C and C++ code that causes the majority of open source CVEs is
shipped in Linux distributions. Linux distros are the de facto package manager
for C code, and C++ to a lesser extent; neither of these languages has
developed its own language specific package management yet. From the &lt;a href=&#34;https://sources.debian.org/stats/&#34;&gt;Debian
stats&lt;/a&gt;, of the billion or so lines of code,
43% is ANSI C and 24% is C++ which has many of the same problems in many
codebases. So 670 &lt;a href=&#34;https://informationisbeautiful.net/visualizations/million-lines-of-code/&#34;&gt;million lines of
code&lt;/a&gt;,
in general without enough maintainers to deal with the existing and coming
waves of security issues that fuzzing will find. This is the backdrop of
increasing complaints about unfixed CVEs in Docker containers, where these tend
to be more visible due to wider use of scanning tools.&lt;/p&gt;

&lt;p&gt;Is it worth fuzzing safer languages such as Go and Rust? Yes, you will still
find edge conditions, and potentially other cases such as race conditions,
although the payoff will not be nearly as high. For C code it is absolutely
essential, but bugs and security issues are found elsewhere. Oh and &lt;a href=&#34;https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html&#34;&gt;fuzzing is
fun&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;My view is that we are just at the beginning of this spike, and we will not
just find all the issues and move on. Rather the Linux
distributions, which have this code, will end up as toxic industrial waste
areas, the &lt;a href=&#34;https://themorningnews.org/gallery/permanent-error&#34;&gt;Agbogbloshie&lt;/a&gt;
of the C era. As the incumbents, no, they will not &lt;a href=&#34;https://www.youtube.com/watch?v=HgtRAbE1nBM&#34;&gt;rewrite it in
Rust&lt;/a&gt;; instead smaller, more nimble,
different types of competitor will outmanoeuvre the
&lt;a href=&#34;https://newsroom.ibm.com/2019-07-09-IBM-Closes-Landmark-Acquisition-of-Red-Hat-for-34-Billion-Defines-Open-Hybrid-Cloud-Future&#34;&gt;dinosaurs&lt;/a&gt;. Linux distros
generally consider that most of their role is packaging not creation, with a
few exceptions like Systemd; most of their engineering work is in the long term
support business, which still pays well despite being increasingly out of step
with how non-C software is used, and how cloud deployments work, where updating
software is part of normal life, and five or ten year software lifetimes
without updates are not the target. We are not going to see the Linux distros
work on solving this issue.&lt;/p&gt;

&lt;p&gt;Is this code exploitable? Almost certainly yes, with sufficient effort. We
discussed Thomas Dullien&amp;rsquo;s paper &lt;a href=&#34;http://www.dullien.net/thomas/weird-machines-exploitability.pdf&#34;&gt;Weird machines, exploitability, and provable
unexploitability&lt;/a&gt; at the &lt;a href=&#34;https://events.com/r/en_US/registration/santis-systems-summit-19-schwagalp-june-757708&#34;&gt;Säntis Systems
Summit&lt;/a&gt;
recently; I highly recommend it if you are interested in
exploitability. But overall, proving code is not exploitable is in general not
going to be possible, and attackers always have the advantage. Sure they will
pick the easiest things first, but most attacks are automated now and attacking
scales well. Security is risk management, but with memory safety being a
relatively easy exploit in many cases, it is a high risk. Obviously not all
this code is exposed to attackers via network or attacker supplied data,
especially in containerised environments, but some is, and you will spend
increasing amounts of time working out what is a risk. The sheer volume of
security issues just makes risk management more difficult.&lt;/p&gt;

&lt;p&gt;If you are a die hard C hacker and want to remain one, the last bastion of C is
of course OpenBSD. Throw up the &lt;code&gt;pledge&lt;/code&gt; barricades, remove anything you can,
keep reviewing. That is the only heroic path left.&lt;/p&gt;

&lt;p&gt;In the short term, start to explore and invest in ways to replace every legacy
C dependency you are currently using. Write a deprecation roadmap. Cut down
your dependencies on Linux distributions. Shift to memory safe languages
everywhere, and if you use C++ make sure you only use the safer subset. Look to
smaller more nimble Linux distributions that start shipping memory safe code;
although the moves here have been slow so far, you only need a little as once
distros stop having to be C package managers they can do a better job of being
minimal userspaces. There isn&amp;rsquo;t much code you really need to run modern
applications that themselves do not have many C dependencies, as
implementations like LinuxKit show. If you just sit on top of the kernel, using
its ABI stability guarantees there is little you need to do other than a little
configuration; well other than worry about the bugs in a kernel written in &amp;hellip; C.&lt;/p&gt;

&lt;p&gt;Memory unsafe languages are not going to get better, or safe. It is time to move on.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Kubernetes as an API standard</title>
      <link>https://www.cloudatomiclab.com/rustyk8s/</link>
      <pubDate>Sun, 27 Jan 2019 19:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/rustyk8s/</guid>
      <description>&lt;p&gt;&lt;em&gt;There is now a &lt;a href=&#34;http://lists.opendev.org/cgi-bin/mailman/listinfo/rustyk8s&#34;&gt;rustyk8s mailing list&lt;/a&gt;
to discuss implementations of the Kubernetes API in Rust.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There was a lot of interest in my tweet a couple of months ago about writing an
implementation of the Kubernetes API in Rust. I had a good conversation at
Kubecon with some people about it, and thought I should explain more why it is
interesting.&lt;/p&gt;

&lt;p&gt;&lt;blockquote class=&#34;twitter-tweet&#34; data-lang=&#34;en-gb&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;So,
it seems the competition is on. Most complete implementation of the Kubernetes
API in Rust by May 20 2019 (kubecon EU). Prize details forthcoming. Tiebreaker
is smallest codebase.&lt;/p&gt;&amp;mdash; Justin Cormack (@justincormack) &lt;a 
href=&#34;https://twitter.com/justincormack/status/1067185866001575936?ref_src=twsrc%5Etfw&#34;&gt;26 November 2018&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; 
charset=&#34;utf-8&#34;&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;Kubernetes is an excellent API for running code reliably. So much so that
people want to run it everywhere. People have described it as the universal
distributed systems API, and something that will eventually be embedded into
hardware, or the kernel (or Linux) of distributed systems. Maybe some of these
are ambitious, but nothing wrong with ambition, and hey it is a nice, simple
API at its core. Essentially it just does reconciliation between the world and
desired state for an extensible set of things, things that include a concept of
a pod by default. That is pretty much it, a simple idea.&lt;/p&gt;
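
&lt;p&gt;The reconciliation idea fits in a few lines. The following is a minimal sketch in Python, not the Kubernetes implementation or its API: the object names, the specs, and the helper functions &lt;code&gt;reconcile&lt;/code&gt; and &lt;code&gt;apply_actions&lt;/code&gt; are all hypothetical, and a real controller would observe the world repeatedly rather than compare two dictionaries once.&lt;/p&gt;

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions that would move `actual` towards `desired`.

    Both arguments map a hypothetical object name to its spec.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def apply_actions(actions: list, actual: dict) -> dict:
    """Apply actions to a copy of the observed state."""
    result = dict(actual)
    for verb, name, spec in actions:
        if verb == "delete":
            del result[name]
        else:
            result[name] = spec
    return result


desired = {"web": {"image": "nginx:1.17"}, "db": {"image": "postgres:12"}}
actual = {"web": {"image": "nginx:1.16"}, "old-job": {"image": "busybox"}}

actions = reconcile(desired, actual)
converged = apply_actions(actions, actual)
```

&lt;p&gt;Running &lt;code&gt;reconcile&lt;/code&gt; again on the converged state yields no actions, which is the property that makes the loop safe to repeat forever.&lt;/p&gt;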

&lt;p&gt;A simple idea, but not simply expressed. If you build a standalone Kubernetes
system, somehow that simple idea amounts to a gigabyte of compiled code. Sure,
there are some extraneous debug symbols, and a few extra versions of etcd for
version upgrades, and maybe one day Go will produce less bloated code, but that
is not going to cut it for embedded systems and other interesting potential use
cases of Kubernetes. Nor is it easy to understand the code, find your way around
it and hack on it.&lt;/p&gt;

&lt;p&gt;Another problem with Kubernetes is that the
implementation is the specification. Lots of projects start like that but as
they mature the specification is often separated, and alternative
implementations can thrive. Without an independent specification, alternative
implementations often have to copy every accidental nuance of the original, and
even replicate bugs. Kubernetes is in the right state where starting to move
towards an independent specification would be productive. We know that there
are some rough edges in the implementation that need to be cleared up, and some
parts where the API is not yet the best it could be.&lt;/p&gt;

&lt;p&gt;One approach is to try to cut back the current implementation to a more
manageable size, by removing parts. This is what Darren Shepherd of Rancher has
done with &lt;a href=&#34;https://github.com/ibuildthecloud/k3s&#34;&gt;&amp;ldquo;k3s&amp;rdquo;&lt;/a&gt;, removing a million or
so lines of code. But a second, complementary approach is to build a new simple
implementation from the ground up without any baggage to start with. Then by
looking at differences in behaviour, you can start to understand which parts
are the core specification, and which parts are accidental. Given that the way
the code for Kubernetes is written has been described as a &amp;ldquo;clusterfuck&amp;rdquo; by
&lt;a href=&#34;https://fosdem.org/2019/schedule/event/kubernetesclusterfuck/&#34;&gt;Kris Nova&lt;/a&gt;,
this seems a productive route: &amp;ldquo;Unknown to most, Kubernetes was originally
written in Java&amp;hellip; If the anti patterns weren’t enough we also observe how
Kubernetes has over 20 main() functions in a monolithic “build” directory&amp;hellip;
Kubernetes successfully made vendoring even more challenging than it already
was, and discuss the pitfalls with this design. We look at what it would take
to begin undoing the spaghetti code that is the various Kubernetes binaries.&amp;rdquo;&lt;/p&gt;

&lt;p&gt;Of course we could write a new implementation in Go, but the temptation would
then be to import bunches of existing code, and it might not end up that
different. A different language makes sense to stop that. The aim should be to
build the minimum needed to implement the core API. So what language? Rust
makes the most sense it seems, although there are some other options.&lt;/p&gt;

&lt;p&gt;There is a small but growing community of cloud native Rust projects. In the
CNCF, there is &lt;a href=&#34;https://github.com/tikv/tikv&#34;&gt;TiKV&lt;/a&gt; from PingCAP and &lt;a href=&#34;https://github.com/linkerd/linkerd2-proxy&#34;&gt;the
Linkerd 2 data plane&lt;/a&gt;. Another
project that has recently been launched in the space is &lt;a href=&#34;https://github.com/firecracker-microvm/firecracker&#34;&gt;AWS
Firecracker&lt;/a&gt;. The Rust
ecosystem is especially strong in security, and control of memory usage, both
of which are important for effective scalable systems. In the last year or so
the core libraries needed in the cloud native space have really been filled in.&lt;/p&gt;

&lt;p&gt;So are you interested in hacking on a greenfield implementation of Kubernetes
in Rust? There is not yet a public codebase to hack on, but I know that there
are some people hacking in private. The minimal viable project is something
that you can talk to with kubectl and run pods, and API extensions. The
conformance tests should help, although they are not complete enough to
constitute a specification by any means, but starting to pass some tests would
be a satisfying achievement. If you want to meet up with the cloud native Rust
community, a bunch of people will be at &lt;a href=&#34;https://www.fosdem.org&#34;&gt;Fosdem&lt;/a&gt; in
early February, and I will sort out a fringe event at KubeCon EU as well. Happy
hacking!&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Why RISC-V?</title>
      <link>https://www.cloudatomiclab.com/risc-v/</link>
      <pubDate>Tue, 01 Jan 2019 18:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/risc-v/</guid>
      <description>&lt;p&gt;You might have noticed me tweeting a bunch about RISC-V in recent months. It is
actually something I have been following for several years now, since the
formation of &lt;a href=&#34;https://www.lowrisc.org/&#34;&gt;LowRISC&lt;/a&gt; in Cambridge quite some time
ago, but this year has suddenly seen a huge maturing of the ecosystem.&lt;/p&gt;

&lt;p&gt;In case you have been sitting under a rock hacking on something for some time,
RISC-V is an open instruction set for CPUs. It is pronounced &amp;ldquo;risk five&amp;rdquo;. It
looks a bit like MIPS, if you know your instruction sets, and yes it is very
RISC, pretty minimal really. It is designed to be cleanly extended, and has 32,
64 and 128 bit implementations. So far the 32 bit version is for
microcontrollers, the 64 bit for operating systems like Linux with MMUs, and
the 128 bit version is for future dreams.&lt;/p&gt;

&lt;p&gt;But an instruction set, even one without licensing and patent issues, is not
that interesting on its own. There are some other options there after all,
although they all have some issues. What is more interesting is that there are
open and freely modifiable open source implementations. Lots of them. There are
proprietary ones too, and hybrid ones with some closed IP and some open, but
the community has been building open. Not just open cores, but new open
toolchains (largely written in Scala) for design, test, simulation and so on.&lt;/p&gt;

&lt;figure&gt;
&lt;a href=&#34;https://scs.sifive.com/core-designer/&#34;&gt;&lt;img width=&#34;80%&#34; 
src=&#34;/sifive.png&#34;&gt;&lt;/a&gt;
&lt;figcaption&gt;SiFive &lt;a href=&#34;https://scs.sifive.com/core-designer/&#34;&gt;core designer&lt;/a&gt;&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;The community growth this year has been huge, starting with the
&lt;a href=&#34;https://archive.fosdem.org/2018/schedule/event/riscv/&#34;&gt;launch by SiFive&lt;/a&gt; of
the first commercially available RISC-V machine that could run Linux at Fosdem
in January. Going to a RISC-V meetup (they are springing up in &lt;a href=&#34;https://www.meetup.com/Bay-Area-RISC-V-Meetup/&#34;&gt;Silicon
Valley&lt;/a&gt;,
&lt;a href=&#34;https://www.meetup.com/Cambridge-RISC-V-Meetup-Group/&#34;&gt;Cambridge&lt;/a&gt;,
&lt;a href=&#34;https://www.meetup.com/Bristol-RISC-V-Meetup-Group/&#34;&gt;Bristol&lt;/a&gt; and
&lt;a href=&#34;https://www.meetup.com/Israel-RISC-V-meetups/&#34;&gt;Israel&lt;/a&gt;) you feel that this is
hardware done by people who want to do hardware like open source software is
done. People are building cores, running in silicon or on FPGA, tooling, secure
enclaves, operating systems, VC funded businesses and revenue funded businesses.
You meet people from Arm at these meetups, finding out what is going on, while
Intel is funding RISC-V businesses, as if they want to make serious competition
for Arm or something! Meanwhile MIPS has opened its ISA as a somewhat late
reaction.&lt;/p&gt;

&lt;p&gt;A few years ago RISC-V was replacing a few small microcontrollers and custom
CPUs, now we see companies like Western Digital announcing they will switch all
their cores to RISC-V, while opening their designs. There are lots of AI/TPU
cores being built with RISC-V cores, and &lt;a href=&#34;https://www.esperanto.ai/&#34;&gt;Esperanto&lt;/a&gt;
is building chips with over a thousand 64 bit RISC-V cores on. The market for
specialist AI chips came along at the same time as RISC-V was maturing, and it
was a logical new market.&lt;/p&gt;

&lt;p&gt;RISC-V is by no means mature; it is forecast to ship 10-100 million cores in 2019,
the majority of them 32 bit microcontrollers. But that adds to the interest: it
is at the stage where you can now start building things, and lots of people are
building things for fun or serious reasons, or porting code, or developing
formal ISA models or whatever. Open source wins because a huge community just
decides it is the future and rallies around every piece of the ecosystem. 2018
was the year that movement became really visible for RISC-V.&lt;/p&gt;

&lt;p&gt;I haven&amp;rsquo;t started hacking on any RISC-V code yet, although I have an idea for a
little side project, but I have joined the RISC-V Foundation as an individual
member and hope to get to the &lt;a href=&#34;https://tmt.knect365.com/risc-v-workshop-zurich/&#34;&gt;RISC-V
Workshop&lt;/a&gt; in Zurich and
several meetups. See you there and happy hacking!&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>2018 Conferences</title>
      <link>https://www.cloudatomiclab.com/conferences-2018/</link>
      <pubDate>Fri, 28 Dec 2018 14:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/conferences-2018/</guid>
      <description>

&lt;p&gt;I gave quite a few talks this year, and also organized several conference tracks.&lt;/p&gt;

&lt;h1 id=&#34;config-mangement-camp&#34;&gt;Config Management Camp&lt;/h1&gt;

&lt;p&gt;It was an excellent Config Management Camp this year, and fun to speak at.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Justin Cormack &lt;a href=&#34;https://www.youtube.com/watch?v=DmcSo1Wts0Q&#34;&gt;Making Immutable Infrastructure Simpler with LinuxKit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;qcon-london-2018&#34;&gt;QCon London 2018&lt;/h2&gt;

&lt;p&gt;I organized the Modern Computer Science in the Real World Track at this conference; it was a great set of talks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Heidi Howard &lt;a href=&#34;https://www.infoq.com/presentations/future-distributed-consensus&#34;&gt;Distributed Consensus&lt;/a&gt; excellent understandable explanation of distributed consensus&lt;/li&gt;
&lt;li&gt;Michael Tautschnig, Formal Methods at AWS: this was not recorded, so you had to be there, but it was really interesting on formal methods in practice&lt;/li&gt;
&lt;li&gt;Gil Tene, &lt;a href=&#34;https://www.infoq.com/presentations/java-jvm-perf&#34;&gt;Java at Speed&lt;/a&gt; on mechanical sympathy&lt;/li&gt;
&lt;li&gt;Martin Kleppmann, &lt;a href=&#34;https://www.infoq.com/presentations/crdt-distributed-consistency&#34;&gt;CRDTs and the Quest for Distributed Consistency&lt;/a&gt; everything you wanted to know about CRDTs from an amazing speaker&lt;/li&gt;
&lt;li&gt;Moritz Lipp, &lt;a href=&#34;https://www.infoq.com/presentations/spectre-meltdown-security&#34;&gt;How Performance Optimizations Shatter Security Boundaries&lt;/a&gt; Moritz is one of the people who discovered Spectre and Meltdown.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also spoke in the Modern Operating Systems track&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Justin Cormack &lt;a href=&#34;https://www.infoq.com/presentations/linuxkit-agile-os&#34;&gt;The Modern Operating System in 2018&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;kubecon-cloud-native-europe&#34;&gt;KubeCon Cloud Native Europe&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Nassim Eddequiouaq and Justin Cormack &lt;a href=&#34;https://www.youtube.com/watch?v=Jbqxsli2tRw&#34;&gt;Understandable Security Controls&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;dockercon&#34;&gt;DockerCon&lt;/h2&gt;

&lt;p&gt;Registration required to watch videos.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Luke Marsden and Justin Cormack &lt;a href=&#34;https://dockercon2018.hubs.vidyard.com/watch/RXqDAs344Xy2ruBqfppMnu&#34;&gt;A Vision of Persistence&lt;/a&gt; this was a great fun collaboration!&lt;/li&gt;
&lt;li&gt;Liz Rice and Justin Cormack &lt;a href=&#34;https://dockercon2018.hubs.vidyard.com/watch/YKc9x3LULa4bDrDyFZXrgP&#34;&gt;Don&amp;rsquo;t Have a Meltdown&lt;/a&gt; as was this!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;oscon&#34;&gt;Oscon&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Justin Cormack &lt;a href=&#34;https://www.oreilly.com/library/view/oscon-2018-/9781492026075/video321466.html&#34;&gt;Immutable Infrastructure: Continuous Delivery for Systems&lt;/a&gt; was supposed to be with Rolf Neugebauer but sadly he could not make it in the end.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;all-things-open&#34;&gt;All Things Open&lt;/h2&gt;

&lt;p&gt;I don&amp;rsquo;t think this was recorded.&lt;/p&gt;

&lt;h2 id=&#34;qcon-sf&#34;&gt;QCon SF&lt;/h2&gt;

&lt;p&gt;I curated the Modern Operating Systems track, and spoke on it. The &lt;a href=&#34;https://qconsf.com/2018-video-schedule&#34;&gt;videos are coming out on 7 and 14 January&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thomas Graf, How to Make Linux Microservice-Aware With Cilium and eBPF&lt;/li&gt;
&lt;li&gt;Alan Kasindorf, Caching Beyond RAM: The Case for NVMe&lt;/li&gt;
&lt;li&gt;Justin Cormack, The Modern Operating System in 2018, a somewhat changed version of my QCon London talk&lt;/li&gt;
&lt;li&gt;Adin Scannell, gVisor: Building and Battle Testing a Userspace OS in Go&lt;/li&gt;
&lt;li&gt;Bryan Cantrill, Is It Time to Rewrite the Operating System in Rust? (Don&amp;rsquo;t miss this!)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;dockercon-eu&#34;&gt;DockerCon EU&lt;/h2&gt;

&lt;p&gt;Registration required to watch videos. I helped organize the Black Belt track which had some great talks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Steven Follis and Israel Vega &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=avoiding-an-identity-crisis-authentication-with-windows-serv&#34;&gt;Avoiding an Identity Crisis: Authentication with Windows Server Containers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Amn Rahman and Roberto Hashioka &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=categorizing-docker-hub-s-public-images-end-to-end-machine-l&#34;&gt;Categorizing Docker Hub&amp;rsquo;s Public Images: End-to-End Machine Learning Pipeline with Docker Enterprise and Kubeflow&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Sean Gillespie &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=program-the-cloud-using-containers-as-the-building-block&#34;&gt;Programming the Cloud using containers as a building block&lt;/a&gt; this talk was really excellent, really excited about Pulumi&lt;/li&gt;
&lt;li&gt;Jaana B Dogan &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=monitoring-and-debugging-containerized-systems-at-scale&#34;&gt;Monitoring and Debugging Containerized Systems at Scale&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Matt Butcher and Gareth Rushgrove &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=secret-session-introducing-cloud-native-application-bundles&#34;&gt;Introducing Cloud Native Application Bundles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Ian Campbell and Tonis Tiigi &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=supercharged-docker-build-with-buildkit&#34;&gt;Supercharged Docker Build with BuildKit&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Christopher Crone &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=extending-kubernetes-moving-compose-on-kubernetes-from-a-crd&#34;&gt;Extending Kubernetes: Moving Compose on Kubernetes from a CRD to API Aggregation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I gave a joint talk on Open Policy Agent and a re-run of the earlier talk with Liz Rice&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Torin Sandall and Justin Cormack &lt;a href=&#34;https://europe-2018.dockercon.com/videos-hub?watch=dynamic-authorization-and-policy-control-for-docker-containe&#34;&gt;Dynamic Authorization and Policy Control for Docker Container Environments&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;kubecon-cloud-native-us&#34;&gt;KubeCon Cloud Native US&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;with Justin Cappos &lt;a href=&#34;https://www.youtube.com/watch?v=76S7ZAwM0h4&#34;&gt;Intro to TUF/Notary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.youtube.com/watch?v=OZJkwvAnLb4&#34;&gt;How to choose a Kubernetes runtime&lt;/a&gt; had fun giving this talk.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;upcoming&#34;&gt;Upcoming&lt;/h2&gt;

&lt;p&gt;Don&amp;rsquo;t miss the Modern Operating Systems track at &lt;a href=&#34;https://qconlondon.com/&#34;&gt;QCon London&lt;/a&gt; which I am curating, should be excellent.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Jessie Frazelle on eBPF&lt;/li&gt;
&lt;li&gt;Avi Deitcher on LinuxKit&lt;/li&gt;
&lt;li&gt;Kenton Varda on Cloudflare Workers&lt;/li&gt;
&lt;li&gt;others TBC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I am planning or hoping to attend in 2019 at least the events below, but also no doubt several other ones.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://rwc.iacr.org/2019/&#34;&gt;Real World Crypto&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://monkigras.com/&#34;&gt;Monkigras&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://fosdem.org/2019/&#34;&gt;FOSDEM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://events.linuxfoundation.org/events/kubecon-cloudnativecon-europe-2019/&#34;&gt;KubeCon EU&lt;/a&gt; and US&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://goto.docker.com/2019-DockerCon-SF-Pre-Reg.html&#34;&gt;DockerCon&lt;/a&gt; and EU&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://tmt.knect365.com/risc-v-workshop-zurich/&#34;&gt;RISC-V Workshop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://qconlondon.com/&#34;&gt;QCon London&lt;/a&gt; and some other QCons no doubt&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Confused Deputies Strike Back</title>
      <link>https://www.cloudatomiclab.com/confused-deputies/</link>
      <pubDate>Thu, 27 Dec 2018 19:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/confused-deputies/</guid>
      <description>&lt;p&gt;A few weeks back Kubernetes had its first really severe security issue,
&lt;a href=&#34;https://github.com/kubernetes/kubernetes/issues/71411&#34;&gt;CVE-2018-1002105&lt;/a&gt;.
For some background on this, and how it was discovered, I recommend &lt;a href=&#34;https://rancher.com/blog/2018/2018-12-04-k8s-cve/&#34;&gt;Darren
Shepherd&amp;rsquo;s blog post&lt;/a&gt;. He
discovered it via some side effects, and initially it did not appear to be a
security issue, just an error handling issue. Of course we know well that many
error handling issues can be escalated, but why was this one so bad?&lt;/p&gt;

&lt;p&gt;To summarize the problem, there is an API server proxy component, that clients
can use to talk to other API endpoints. As the &lt;a href=&#34;https://github.com/kubernetes/kubernetes/files/2700818/PM-CVE-2018-1002105.pdf&#34;&gt;postmortem
document&lt;/a&gt; says:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Kubernetes API server proxy components still use http/1.1 upgrade-based
connection tunneling, which does not distinguish between request data sent by
the apiserver while establishing the backend connection, and data sent by the
requesting user&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;High and low-privilege API requests to aggregated API servers are proxied via
the same component with the same high-permission transport credentials&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Well, this security issue is actually well known enough to have its own name,
it is the &lt;a href=&#34;https://en.wikipedia.org/wiki/Confused_deputy_problem&#34;&gt;confused deputy
problem&lt;/a&gt;, originally
written about by &lt;a href=&#34;http://zoo.cs.yale.edu/classes/cs422/2010/bib/hardy88confused.pdf&#34;&gt;Norm Hardy in
1988&lt;/a&gt;
although referring to an original example from the 1970s. The essence of the
problem is that there are three parties involved, a user, a proxy or deputy
type component and an object or service that needs to be accessed, or a similar
set of endpoints. The user connects to the deputy to perform an action on an
object, but the deputy can be persuaded to act on an object that it has access
to rather than one the end user has access to.&lt;/p&gt;

&lt;p&gt;Imagine asking your accountant to fill in your tax return. Your accountant has
access to your tax return, but also to those of other customers. If the
accountant is buggy or can be confused she could fill in one of these tax
returns instead of yours. The general problem is that in order to run a tax
return filing service, you need the ability to fill in lots of different
people&amp;rsquo;s tax returns. You become a very privileged node, a superuser of tax
returns. The tax office has to respect your authority to fill in lots of tax
returns, and read them, so the accountant&amp;rsquo;s credentials must be very
privileged. We see similar designs in all sorts of places, like suid
applications in Unix that can do operations on behalf of any user and must be
very highly trusted, and are often the source of security bugs.&lt;/p&gt;

&lt;p&gt;What is the solution? Well, we could do without these deputies. Fill in your own tax
return! But in effect this says do not use microservices. If every endpoint
needs to have the code for filling in tax returns we lose the benefit of
microservices, we have to update lots of endpoints together, we cannot have a
team building better accountant services and so on. What we really want is that
the accountant does not have to be a superuser, but instead she has no
permissions on her own but we can pass credentials (maybe time limited) to
update our tax return (but not to generally impersonate us) with our request.
This access control model is called &lt;a href=&#34;https://en.wikipedia.org/wiki/Capability-based_security&#34;&gt;capability-based
security&lt;/a&gt;: access is
granted via unforgeable but transferable tokens that provide access to objects.
You can imagine they are keys, like passing your car key to a valet service,
rather than the valet service having a master key for all cars that they might
need to park.&lt;/p&gt;
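
&lt;p&gt;To make the accountant example concrete, here is a minimal sketch of a deputy
that holds no permissions of its own. The TaxOffice and Accountant classes and all
their method names are hypothetical, invented only for illustration, and the step
where the owner authenticates before minting a capability is assumed and left out.&lt;/p&gt;

```python
import secrets


class TaxOffice:
    """Hypothetical service holding tax returns; all names here are illustrative."""

    def __init__(self):
        self._returns = {}        # owner name -> text of the return
        self._capabilities = {}   # unguessable token -> owner it grants access to

    def open_return(self, owner):
        self._returns[owner] = ""

    def mint_capability(self, owner):
        # Assumption: the owner has already authenticated; skipped in this sketch.
        token = secrets.token_urlsafe(32)
        self._capabilities[token] = owner
        return token

    def update_return(self, token, text):
        # Authority comes only from the presented token, never from who is calling.
        owner = self._capabilities.get(token)
        if owner is None:
            raise PermissionError("unknown capability")
        self._returns[owner] = text
        return owner


class Accountant:
    """A deputy with no standing permissions: it can only use what it is handed."""

    def __init__(self, office):
        self._office = office

    def file_return(self, capability, text):
        return self._office.update_return(capability, text)


office = TaxOffice()
office.open_return("alice")
office.open_return("bob")

alice_cap = office.mint_capability("alice")
accountant = Accountant(office)

# Alice delegates by passing her capability along with the request.
filed_for = accountant.file_return(alice_cap, "alice 2018 return")
```

&lt;p&gt;The accountant here cannot touch the other return however confused it gets, as it
was never handed a token for it. A real system would also want expiry and revocation,
which this sketch leaves out.&lt;/p&gt;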

&lt;p&gt;The standard access control list (ACL) models of authorization are all about
making decisions based on identity, a concept that clearly must not be
transferable. I never want my accountant to have to (or be able to) pretend to
be me to fill in my tax return. The classic solution in this case would be for
me to be able to add additional people to the ACL for my tax return; this is
modeled in new ACL frameworks like NGAC from NIST (sorry, no link right now as the
website is down due to the government shutdown). This does not immediately seem
applicable to the Kubernetes issue though, and is much more complex than
passing my API access credential to the API proxy server. At this point I
highly recommend the excellent short paper &lt;a href=&#34;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.406.4684&amp;amp;rep=rep1&amp;amp;type=pdf&#34;&gt;ACLs don’t by Tyler
Close&lt;/a&gt;, one of my favourite papers (I should do a papers we love session
on it). His examples mainly come from the browser, another prevalent deputy
with a lot of security issues, such as CSRF, another confused deputy attack.
Capabilities are actually very simple to understand and reason about.&lt;/p&gt;

&lt;p&gt;ACL based security is fine for many situations, in particular where there are
only two parties and you just want to mediate access to a set of resources. But
microservices do not appear to be in that sweet spot, as Kubernetes found out
with its API proxy microservice. Bugs can be fixed, but as the retrospective
points out all changes will need to be examined for security issues. As Tyler
Close says &amp;ldquo;the correct implementation of an access policy cannot be
ascertained by an examination of the ACLs configured for an application, but
must also include an examination of the program’s source code. To date, this
technique has been error prone.&amp;rdquo; It was not even the only bug that week that
was a confused deputy issue: the &lt;a href=&#34;https://medium.com/tenable-techblog/remotely-exploiting-zoom-meetings-5a811342ba1d&#34;&gt;Zoom critical
bug&lt;/a&gt; was the same kind of issue, where UDP packets could confuse the deputy
service. These are critical issues happening on a regular basis, and no doubt
many more lurk.&lt;/p&gt;

&lt;p&gt;The entire reason for microservices is to have third parties to delegate
services to, and we need to shift away from ACL based models to capabilities
for microservices. Of course this is non trivial, distributed capabilities (as
opposed to local ones) have not been used much and we don&amp;rsquo;t have a good
infrastructure for them yet. I will write more about practicalities in a
further post, but we need to start shifting security to be microservice native
too,
too not just adopting things that worked for monoliths.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>QUIC for Unikernels</title>
      <link>https://www.cloudatomiclab.com/quic-unikernels/</link>
      <pubDate>Wed, 26 Dec 2018 20:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/quic-unikernels/</guid>
      <description>&lt;p&gt;I had until recently mostly been ignoring QUIC. In case you had, QUIC is a
new-ish protocol developed to Google, that will probably be HTTP version 3. The
interesting pieces are that it runs over UDP not TCP, but supports reliable
delivery by implementing retransmission itself, that it supports multiple
streams without head of line blocking, and that it is designed to support
encryption natively. Another important benefit is that connections can migrate
from one IP to another without being dropped, as there is a connection ID
independent of source and destination addresses. It is also designed to not
ossify, with as much of the packet encrypted as possible, so that
intermediaries cannot inspect what is inside the packet and make decisions, to
avoid the great difficulty in changing TCP where adding new extensions has a
small chance of working as middle boxes will often strip things they do not
understand, or not let the packets through. If all you can see is UDP flows it
is harder to do much. Another benefit is faster handshake for the encrypted
case. Proposed extensions include different algorithms for congestion control,
for example for different environments like in datacentre or high latency
connections, and forward error correction.&lt;/p&gt;

&lt;p&gt;I had partly been ignoring QUIC as it has not yet been finalised, and had some
temporary encryption included that was going to be removed to be replaced by
TLS 1.3, and also as it seemed to be very tied up with HTTP/2. I also had some
idea that layering encryption over TCP in possibly slightly non spec compliant
ways might make sense. But then a few weeks back, a paper on &lt;a href=&#34;https://dl.acm.org/citation.cfm?id=3284854&#34;&gt;nQUIC or Noise on
QUIC&lt;/a&gt; came out, and I decided to
take another look. It turned out that there were other people interested in
removing the strong tie to TLS, and also looking further it seems that the
protocol is not that tied to HTTP, and it does provide a general transport with
multiple streams. The &lt;a href=&#34;https://datatracker.ietf.org/doc/draft-ietf-quic-transport/&#34;&gt;IETF standard
drafts&lt;/a&gt; split out
the TLS implementation, and it looks like there is interest in pushing for a
standard Noise based version. QUIC is not significantly more complex than TCP,
especially as you can in effect hard code the number of streams if you do not
want to use that feature, for example on an embedded system. Noise over QUIC,
without HTTP looks pretty reasonable for small systems that have enough
performance to do encryption and a little memory, even down to
microcontrollers. You could even customise it for some applications in closed
environments.&lt;/p&gt;

&lt;p&gt;So what has this got to do with unikernels? Well the interesting thing about
QUIC is that it always runs in userspace, not in the kernel on conventional
systems. So that puts unikernels on an equal footing, they can use the same
implementations as other applications use. There are &lt;a href=&#34;https://github.com/quicwg/base-drafts/wiki/Implementations&#34;&gt;already
implementations&lt;/a&gt; in
C, C++, Go, Rust, TypeScript, Objective C, Python, and no doubt more.
Interfacing QUIC to a transport stack is pretty simple as UDP is just a thin
layer over IP. There is no reason why the implementation should be any
less efficient; indeed it can probably be made more efficient as it can bypass
several abstraction layers.&lt;/p&gt;

&lt;p&gt;There are some potential issues in that some firewalls block QUIC (which is
typically on UDP port 443); browsers will switch to TCP in that case. A QUIC
only unikernel might not have that luxury, especially in some embedded
situation. Larger machines can still fall back to TCP, but that can be a less
optimised version. The main use case for QUIC would initially be for traffic
between dedicated unikernel or embedded services, especially if you are using
Noise rather than TLS for a very small implementation, not public endpoints.
There are &lt;a href=&#34;https://calendar.perfplanet.com/2018/quic-and-http-3-too-big-to-fail/&#34;&gt;some
concerns&lt;/a&gt;
 that the CPU overhead of QUIC is higher, so it may not be suitable for
embedded applications, and there are no benefits over TCP for those cases. But
there is freedom to iterate in a way that there is much less with TCP, so I
think it is definitely worth examining. Research into whether CPU overhead is a
necessary part of the protocol, and how to measure efficiency in different
environments, would also be productive.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Distributed Capabilities via Cryptography</title>
      <link>https://www.cloudatomiclab.com/crypto-capabilities/</link>
      <pubDate>Sat, 22 Sep 2018 12:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/crypto-capabilities/</guid>
      <description>

&lt;p&gt;&lt;img src=&#34;./hammer-robot.jpg&#34; width=&#34;80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;This is a follow up to &lt;a href=&#34;https://www.cloudatomiclab.com/noise-capabilities/&#34;&gt;my previous capabilities
post&lt;/a&gt;. As before, you
probably want to read &lt;a href=&#34;http://srl.cs.jhu.edu/pubs/SRL2003-02.pdf&#34;&gt;Capability Myths
Demolished&lt;/a&gt; and the &lt;a href=&#34;https://noiseprotocol.org/noise.html&#34;&gt;Noise Protocol
specification&lt;/a&gt; first for full value
extraction. This is a pretty rough draft, I was going to rewrite it but decided
just to publish as is, work in progress, and write another post later, having
left it for several weeks after writing it. This stuff needs a much clearer
explanation.&lt;/p&gt;

&lt;p&gt;I went to a &lt;a href=&#34;https://protocol.ai&#34;&gt;Protocol Labs&lt;/a&gt; dinner last night (thanks for
the invite Mike) and managed to corner Brian Warner from
&lt;a href=&#34;https://agoric.com/&#34;&gt;Agoric&lt;/a&gt; and ask about cryptographic distributed
capabilities, which was quite helpful. This stuff has not really been written
down, so here is an attempt to do so. I should probably add references at some
point.&lt;/p&gt;

&lt;h2 id=&#34;non-cryptographic-capabilities&#34;&gt;Non cryptographic capabilities&lt;/h2&gt;

&lt;p&gt;For reference and completeness, let us cover how you transmit capabilities
without any cryptography, and what the downsides are. The basic model is called
the &amp;ldquo;Swiss number&amp;rdquo;, after a (I suspect somewhat mythical) model of an anonymous
Swiss bank account, where you just present the account number, which you must
keep secret, in order to deposit and withdraw money, no questions asked. This
is pretty much the standard model in the historic literature, largely written
before public key cryptography was feasible to use. In modern terms, the Swiss
number capability should be a random 256 bit number, and the connection should
of course be encrypted to prevent snooping. The implementation is easy, just
check (in constant time!) that the presented number is equal to the object&amp;rsquo;s
number. Minting these is trivial. The capability is a random bearer token.&lt;/p&gt;
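
&lt;p&gt;As a rough sketch of the Swiss number model, using only the Python standard
library: the SwissObject class and mint_swiss_number function are hypothetical names
made up for this example, and the encrypted connection the number would travel over
is assumed rather than shown.&lt;/p&gt;

```python
import hmac
import secrets


def mint_swiss_number():
    # 32 random bytes = 256 bits, drawn from the operating system CSPRNG.
    return secrets.token_bytes(32)


class SwissObject:
    """Hypothetical object guarded by a single Swiss number."""

    def __init__(self):
        self.swiss_number = mint_swiss_number()

    def check(self, presented):
        # compare_digest takes time independent of where the inputs differ,
        # so the check does not leak how many leading bytes matched.
        return hmac.compare_digest(presented, self.swiss_number)


account = SwissObject()
capability = account.swiss_number      # handing this over transfers access
wrong_guess = mint_swiss_number()      # an unrelated random number

granted = account.check(capability)
denied = account.check(wrong_guess)
```

&lt;p&gt;Note that check has to be handed the number itself, so whoever runs the check
learns the capability.&lt;/p&gt;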

&lt;p&gt;The downsides are pretty clear. First is that you may present the capability to
the wrong party for checking. Checking and transfer of capabilities are very
different operations, and we would like that checking did not reveal the token.
This is a general problem with bearer tokens, such as JWT, they can easily be
presented to the wrong party, or to a man in the middle attacker. We would like
cryptographic protection for the check operation to avoid this. The second
downside, which is somewhat related, is that we have no idea how to identify
the intended object. Any party who has a copy of the capability can pretend to
be the object it refers to, as there is no asymmetry between parties. We have
to rely on some external naming system, that might be subverted. The third
issue is that we have to build our own encryption, and the token we have does
not help, as it does not act as a key or help identify the other party. So we
have to rely on anonymous key exchange, which is subject to man in the middle
attacks as we do not know an identity for the other participant, or again some
sort of external source of truth, such as the PKI system.&lt;/p&gt;

&lt;p&gt;These downsides are pretty critical for modern secure software, so we need to
do better. We will refer to these three properties (check does not reveal,
object identifier, and encryption included) to analyze some alternatives.&lt;/p&gt;

&lt;p&gt;There are some things I am not going to discuss in this post. I mentioned the
model of secret public keys, which appears in some of the literature, in an
earlier post, but will ignore it here as it has security issues. I am not going
to cover macaroons either; they are another form of bearer token with
differently interesting properties.&lt;/p&gt;

&lt;h2 id=&#34;cryptographic-capabilities&#34;&gt;Cryptographic Capabilities&lt;/h2&gt;

&lt;p&gt;The obvious way to solve the second problem, of being able to identify the
object that the capability refers to securely, is to give the object an
asymmetric key. We can then hand out the public key, which can usefully be the
object identifier and be used to locate it, while the object keeps its private
key secure, and does not hand it out to any other object (it can be kept in a
TPM type device as it is only needed for restricted operations).
We can now set up an encrypted channel with this object, and as we know the
public key up front, we can be sure we have connected to the right object if we
validate this correctly. In Noise Protocol terms, we can use an NK handshake
pattern, where the connecting object is anonymous but it knows the public key
it is connecting to. We can also use XK (or IK) if we want to pass the object
identity of the connecting object, for example for audit purposes. Once we have
connected, we can use the Swissnum model to demonstrate we have the capability,
but without the risk of passing the capability to the wrong party.&lt;/p&gt;

&lt;p&gt;However, we can improve this, by using the Swissnum as a symmetric key, and
incorporating it as a secret known by both parties into the asymmetric
handshake. In Noise Protocol terms this is the NKpsk0 handshake (or XKpsk0)
that I mentioned in my previous post. The handshake will only succeed if both
parties have the same key, as the key is securely mixed into the shared
symmetric key that is generated from the Diffie-Hellman exchange of the public
keys. This is even better than the Swissnum method above, as the handshake is
shorter as you do not need the extra phase to pass and potentially acknowledge
the Swissnum; it looks pretty similar as a symmetric key is generally just an
arbitrary random sequence of 256 bits or so anyway.&lt;/p&gt;
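
&lt;p&gt;A toy illustration of why mixing the Swissnum into the handshake means it only
succeeds when both sides hold the same key. This is not the real Noise key schedule:
derive_session_key is a hypothetical stand-in built from HMAC-SHA256, and the
Diffie-Hellman output is replaced by a random placeholder, as the standard library
has no Diffie-Hellman implementation.&lt;/p&gt;

```python
import hashlib
import hmac
import secrets


def derive_session_key(dh_output, psk):
    # Stand-in for the real key schedule: key the hash with the DH output
    # and feed in the pre-shared key, so both secrets affect the result.
    return hmac.new(dh_output, psk, hashlib.sha256).digest()


dh_output = secrets.token_bytes(32)   # placeholder for the shared DH secret
swissnum = secrets.token_bytes(32)    # the capability, used as the PSK

object_key = derive_session_key(dh_output, swissnum)
holder_key = derive_session_key(dh_output, swissnum)
intruder_key = derive_session_key(dh_output, secrets.token_bytes(32))

keys_agree = hmac.compare_digest(object_key, holder_key)
intruder_agrees = hmac.compare_digest(object_key, intruder_key)
```

&lt;p&gt;A party without the Swissnum ends up with a different session key, so its
messages fail to decrypt, and the Swissnum itself is never sent over the wire.&lt;/p&gt;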

&lt;p&gt;This model does solve all our three issues, as a handshake to the wrong party
does not reveal the capability, the object cannot be spoofed by another
(without stealing the private key) and the keys support an encrypted channel.
It is not the only mechanism however. Minting new capabilities is easy, you
just create a new symmetric key, and creating objects is easy, create an
asymmetric keypair.&lt;/p&gt;

&lt;p&gt;Instead of using an asymmetric key and a symmetric key, Brian Warner pointed
out to me yesterday that we can present a certificate instead of the symmetric
key. This is slightly more complex. To demonstrate possession of a capability,
we will present a certificate to the object. We have to sign an ephemeral that
the object presents us, and the simplest method is if the object that the
capability is for has the public key to check the signature, and the capability
is the private signing key. Anyone with the capability can directly sign the
certificate, and you pass the private key around to transfer the capability.
Note that the subject of the capability does not need to know the private
signing key, so it cannot necessarily pass on a capability to access itself.
This might be an advantage in some circumstances. Note that the holders of the
capability need to transfer a private key to pass the capability on, so they
cannot hold the key in a TPM device that does not allow key export, or indeed a
general cryptographic API that only supports a private key type that has
signing operations but not an export operation, which has been common practice.
Note that the Noise Protocol Framework support for signatures is a work in
progress, scheduled for a revision later this year.&lt;/p&gt;

&lt;p&gt;If you don&amp;rsquo;t want to pass around private keys, you could use a chained
signature model, where each party that passes on a capability adds to a
signature chain, authenticating the public key of the next party, all chaining
down to the original key. This would mean unbounded lengths of chain though,
that would be a problem for many use cases. It would provide an audit trail of
how each party got the capability, but transparency logs probably do this more
effectively.&lt;/p&gt;

&lt;p&gt;Thinking about this model, actually we do not need to use signatures, we can
just use encryption keys directly. The same as before the object the capability
is granted over has a private encryption key, but instead of using signatures,
we can create an asymmetric encryption keypair, and give the object the public
key, while capability holders get the private key, and pass the private key
around as the capability. So to validate an encryption handshake, the object
will check that the capability holder has the correct private key, while the
capability holder will validate it is talking to the object that possesses the
identity private key. In Noise protocol terms this is a KK handshake, where both
parties know the public key for the other party, and verify that each possesses
the private key. The signature version is a KK variant with one signature
substituted for an encryption key, and there is another variant where both keys
are replaced by signatures; the Noise signature protocol modifiers allow
signatures to substitute for longer term with ephemeral Diffie-Hellman key
agreement in any combination, with some deferral modifications.&lt;/p&gt;
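
&lt;p&gt;To make the key agreement concrete, here is a minimal Python sketch of the
four Diffie-Hellman operations in a KK handshake. It is an illustration under
loud assumptions, not Noise itself: the group is a toy modular exponentiation
group rather than X25519, &lt;code&gt;mix&lt;/code&gt; is a plain SHA256 stand-in for the
Noise key mixing, and all the function names are made up for this example. It
only shows that the object and a capability holder derive the same key when the
holder has the capability private key.&lt;/p&gt;

```python
import hashlib
import secrets

# Toy Diffie-Hellman group for illustration only: NOT secure, and not the
# X25519 function that real Noise implementations use.
P = 2**255 - 19
G = 2


def keypair():
    """Return a (private, public) pair in the toy group."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def dh(private, public):
    """Toy DH: both sides end up with G ** (a * b) mod P."""
    return pow(public, private, P)


def mix(chaining_key, shared_secret):
    """Hypothetical stand-in for key mixing: hash each DH output into the key."""
    data = chaining_key + shared_secret.to_bytes(32, "big")
    return hashlib.sha256(data).digest()


def initiator_key(cap_priv, obj_pub, e_priv, obj_e_pub):
    """Capability holder side: DH tokens es, ss, ee, se in KK order."""
    ck = b"toy-KK"
    for secret in (dh(e_priv, obj_pub), dh(cap_priv, obj_pub),
                   dh(e_priv, obj_e_pub), dh(cap_priv, obj_e_pub)):
        ck = mix(ck, secret)
    return ck


def responder_key(obj_priv, cap_pub, e_priv, holder_e_pub):
    """Object side: the same four DH results computed from the other end."""
    ck = b"toy-KK"
    for secret in (dh(obj_priv, holder_e_pub), dh(obj_priv, cap_pub),
                   dh(e_priv, holder_e_pub), dh(e_priv, cap_pub)):
        ck = mix(ck, secret)
    return ck


obj_priv, obj_pub = keypair()    # identity key of the object
cap_priv, cap_pub = keypair()    # capability key; the object only stores cap_pub
h_e_priv, h_e_pub = keypair()    # holder ephemeral key
o_e_priv, o_e_pub = keypair()    # object ephemeral key

holder_key = initiator_key(cap_priv, obj_pub, h_e_priv, o_e_pub)
object_key = responder_key(obj_priv, cap_pub, o_e_priv, h_e_pub)
print(holder_key == object_key)  # prints True
```

&lt;p&gt;Replacing &lt;code&gt;cap_priv&lt;/code&gt; with any other private key almost certainly
gives the initiator a different result, so in this sketch the key agreement
fails for anyone who does not hold the capability.&lt;/p&gt;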

&lt;p&gt;So we see that rather than using the mixed symmetric and asymmetric key model
(NKpsk) that I discussed before, we can use asymmetric key only (KK) models for
distributed capabilities. The differences for the user are relatively small, as
both methods fulfil our three criteria, we just have the difference that the
object need not be able to pass on capabilities to itself in the public key
only model, and the fact that we have to pass around asymmetric private keys,
which there is a reluctance to do sometimes. For quantum resistance, it is
possible to use a combination of both symmetric and asymmetric keys, sharing
a symmetric key among all parties.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Open Source and Cloud: who eats who?</title>
      <link>https://www.cloudatomiclab.com/open-source-cloud/</link>
      <pubDate>Mon, 27 Aug 2018 16:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/open-source-cloud/</guid>
      <description>

&lt;p&gt;&lt;img src=&#34;./Eduardo-Paolozzi-Cloud-Atomic-Laboratory-Chimpanzee-in-Test-Box-Designed-For-Space-Flight.jpg&#34; width=&#34;80%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;While I was on holiday there was a bit of an outburst of discussion around new licenses for open source, or less open source, code. But in many ways the more interesting piece, which puts it in context, was Joseph Jacks&amp;rsquo; seventh Open Consensus piece, &lt;a href=&#34;https://medium.com/open-consensus/7-oss-will-eat-all-of-cloud-computing-40fdbf9c1da0&#34;&gt;&amp;ldquo;OSS Will Eat Cloud Computing&amp;rdquo;&lt;/a&gt;. The Redis arguments about changing some code from the anti cloud provider AGPL to a no commercial exploitation license were aimed only at the cloud providers making open source code available as a service while often not contributing back significantly, and not providing any revenue to upstream open source companies.&lt;/p&gt;

&lt;p&gt;But if OSS is going to eat up cloud, and the commercial open source companies are also going to be increasingly successful, then this starts to seem like just a teething issue, soon to be forgotten. So let us look at what the optimistic case looks like. First though I think it helps to look at the open source strategies of the cloud providers themselves, to give some context to the current position.&lt;/p&gt;

&lt;h2 id=&#34;open-source-and-the-cloud-providers&#34;&gt;Open source and the cloud providers&lt;/h2&gt;

&lt;p&gt;AWS does not produce a lot of open source software of any great interest itself. It has started to show some interest in strategic moves, such as backing the nodeless Kubernetes work but overall, nothing of interest. Plenty of open source to help you use its services of course. On the other hand it is the main co-opter that Redis Labs complains about, shipping often slightly old and modified versions of open source software. There are many examples, from the widely used MySQL service to many other projects, such as Redis sold as &amp;ldquo;Amazon ElastiCache for Redis&amp;rdquo;, which is &amp;ldquo;Redis compatible&amp;rdquo; but does not confirm or deny being upstream Redis. Another example is Amazon ECS, which bundled open source Docker with an orchestration layer quite different from the way that the open source community was going, as a proprietary way to run Docker that has ultimately been unsuccessful.&lt;/p&gt;

&lt;p&gt;Azure produces the largest quantity of open source of any of the three major US based cloud providers, with much of the Azure infrastructure actually being open source, including Service Fabric, the entirety of the Function as a service platform and so on. However the sheer quantity of open source coming out of Microsoft means that it needs curation and most of it has not created a deep community. They urgently need to prioritise a smaller number of core projects, get community contribution or shut them down if this fails, and get them into the CNCF to counteract the Google centric nature of the projects there. With careful management Azure could be very well positioned for an OSS eats the cloud future but it needs the community behind it, or failing that to follow the community instead.&lt;/p&gt;

&lt;p&gt;Google is the smallest of the cloud players but has so far used open source the most strategically, and has shown that a large investment, such as that in Kubernetes, can push developer mindshare in the direction you want, and gain mass adoption. This has been combined with other successful projects, such as Go, which has been one of the most successful programming language launches of recent times. However it is not clear to me that this success can necessarily be replicated on an ongoing basis, and there are opportunities for other players to disrupt the strategic play. First Google demands extreme control over its projects, and many of its &amp;ldquo;open source&amp;rdquo; projects are just code over the wall plays, with no interest in real community involvement. Others offer community involvement very late once the direction is already strongly defined, as is clear in the decision to not donate Istio to the CNCF until well post the 1.0 release. There is a whole strategic roadmap being mapped out, pushing the Google way of doing things, and I predict that much of it will not stick in the end. Not every project is going to be in the right place at the right time that Kubernetes was. Another issue is that the suite of &amp;ldquo;GIFEE&amp;rdquo; (Google infrastructure for everyone else) companies that ex-Googlers like to start up and Google Ventures likes to fund, which further spread the Google way, have a problem. The main issue is that Google already has an internal project that matches these, so in many cases they do not have any interest in actually buying a company with an open source reimplementation. So there is no real process for an exit to Google, unlike the classic Cisco spin out and then purchase model for startups where the viable companies get bought up by the original company again.
The biggest exit in this space has been CoreOS, which was purchased to remove a competitor in the Kubernetes market; the Linux distribution it started with added no value to the transaction.&lt;/p&gt;

&lt;p&gt;The other impact of all three cloud providers that is important is the hiring. Many engineers who might otherwise be working on other open source projects are being hired by the cloud providers to work on their projects, which are largely not open source. The rapidly growing revenues and high margins mean that hiring by the three is very significant, and both Amazon and Microsoft have profits and stock prices (and hence equity compensation) that are now largely being driven by the growth in cloud. Google is still largely an advertising company, and Google Cloud is very small compared to the other two, so there is less of a direct multiplier there. This adds to pressure on salaries in the rest of the industry and shifts people to working on the cloud providers&amp;rsquo; open and closed source strategic directions.&lt;/p&gt;

&lt;h2 id=&#34;what-the-near-term-would-look-like&#34;&gt;What the near term would look like&lt;/h2&gt;

&lt;p&gt;If open source is to eat cloud, cloud has to become the commodity layer. We have a similar recent model in the mobile phone providers in the 1990s. Suddenly there was a huge opportunity for telecom companies who were in a mature low margin business, plus upstart businesses to enter a high growth high margin business. Huge profits were made, new global giant companies such as Vodafone were created, and, well, in the end mobile became just a commodity business to run the (open source driven) internet over. Margins continue to fall, and no new value was captured by the network owners despite many attempts. The details of how this failed are not that relevant perhaps; the important thing is that trillions of dollars of value capture that was hoped for, even expected, did not in the end materialize. The key is the &amp;ldquo;dumb pipes&amp;rdquo; phrase that telcos worried about so much.&lt;/p&gt;

&lt;h2 id=&#34;dumb-cloud-pipes&#34;&gt;Dumb cloud pipes&lt;/h2&gt;

&lt;p&gt;The route to dumb cloud pipes involves a fair number of forces converging. First the explosive growth in cloud as a whole, as happened in mobile, removes much price pressure, while there is an explosion of services (especially at AWS, which is constantly launching them). With the huge demand, there is initially little price sensitivity, and there is an exploration of services while people discover which are most useful. Pricing is opaque and users do not initially realise exactly what they are going to consume or how much value it has. This is our current phase, with cloud provider margins being very high. Counteracting this comes customer maturity, such as better budget control, standardised feature sets, and better price negotiation. Prices could easily start to fall again, at a rate of 20%-30% a year or higher for long periods. The clouds will try to build moats and lock in at this point, building on the current points of lock in. These especially include IAM, where models and APIs differ substantially between providers, hence the build out of deeper security related services such as cloud HSM and other areas that deepen use of these APIs. Data gravity is another moat, and some people have suggested that data storage might end up being subsidised until it is free, anything to get more data in; transit costs dominate for many use cases anyway, and highly discourage cross cloud use cases. Cloud provider network costs are deliberately high.&lt;/p&gt;

&lt;p&gt;In general, like the old saying about the internet, that it sees censorship as something to route around, open source tends to see lock in and moats as something to fill in. We already have a situation where the S3 API (the oldest part of AWS) is the standard for all providers, and has open source implementations such as Minio. Managed Kubernetes is another area where all the providers are being forced by the market to provide a standard interface. Pure compute is not so standardised but is undifferentiated. The next thing we see coming are higher level interfaces over whole classes of API; one example of the type of approach is Pulumi, which provides a very different interface, programming language focused rather than API focused, but designed to work across arbitrary clouds without caring. Note that some of the Google open source efforts promote these types of changes, in order to try to make their cloud more interchangeable with AWS in particular, but they also have a large amount of proprietary technology that they are using to attempt moat building at the same time.&lt;/p&gt;

&lt;h2 id=&#34;community-of-purpose&#34;&gt;Community of purpose&lt;/h2&gt;

&lt;p&gt;There are some open source companies already working in this space, including my employer Docker and several other large scale companies, as well as the wealth and diversity of smaller, growing companies that make up the current community. As Joseph points out in his post, these commercial open source companies are growing very rapidly but this is being largely ignored as cloud is more obvious. There is plenty more room of course, and as customers gradually realise that the cloud provision is a dumb pipe and the open source software they run on top is where the actual value is they will want to get it from the real providers, and engage directly with the communities to contribute and pay for changes and support.&lt;/p&gt;

&lt;p&gt;Ultimately it is the end customers who swing the situation, realising that pipes and hardware are just utility; the people there, as we have seen elsewhere, continue to move towards and engage with open communities, open source communities, and demand that their organizations fully engage too. So far we have seen that every enterprise has engaged in the consumption of open source software, but its value is still only tangentially embedded in the majority of organisations. A decade ago we used to sell open source software because people would decide they would have one open source vendor in a pitch, and we would win it. Soon there will be whole areas of software, especially infrastructure, where closed source, including closed source cloud services, just won&amp;rsquo;t be viable.&lt;/p&gt;

&lt;p&gt;Countering the rosy view that the experience of open source as a better development model will inevitably lead to growth in understanding and use of open source, what if people just like free or cheap and convenient, and cloud delivered proprietary services are good enough? Essentially though that is just an argument that cloud providers are the best at producing convenient interfaces; historically that has been true as it is their business, but it is not an exclusive ability, just one that needs working on. As &lt;a href=&#34;https://twitter.com/sophaskins/status/1033098661599698945&#34;&gt;Sophie Haskins points out&lt;/a&gt;, open source companies have often undervalued the work on actually making their code deployable and maintainable in production, which the cloud providers have done instead, in a closed way. Taking back ownership of this clearly will help.&lt;/p&gt;

&lt;p&gt;Overall the question is will open communities simply fold over in the path of cloud provision, or will they route around blockages to open innovation and co-opt the infrastructure for new purposes and tools. It is hard not to be optimistic given the current rate of innovation.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>From Filesystems to CRUD and Beyond</title>
      <link>https://www.cloudatomiclab.com/post-crud/</link>
      <pubDate>Sun, 08 Jul 2018 22:21:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/post-crud/</guid>
      <description>

&lt;h1 id=&#34;from-filesystems-to-crud-and-beyond&#34;&gt;From Filesystems to CRUD and Beyond&lt;/h1&gt;

&lt;p&gt;In the beginning there was the filesystem. Well, in 1965 to be precise, when
&lt;a href=&#34;http://www.multicians.org/fjcc4.html&#34;&gt;Multics introduced the first
filesystem&lt;/a&gt;. The Multics filesystem was
then implemented in Unix, and is pretty recognisable, with a few tweaks, as
what we use as filesystems now. Files are byte addressed, with &lt;code&gt;read&lt;/code&gt;, &lt;code&gt;write&lt;/code&gt;
and &lt;code&gt;seek&lt;/code&gt; operations, and there are read, write and execute permissions by
user. Multics had permissions via a list of users rather than groups, but the
basic structure is similar.&lt;/p&gt;

&lt;p&gt;We have got pretty used to filesystems, and they are very convenient, but they
are problematic for distributed and high performance systems. In particular, as
standardised by Posix, there are severe performance problems. A &lt;code&gt;write()&lt;/code&gt; is
required to write data before it returns, even in a situation where multiple
writers are writing to the same file; the write should then be visible to all
other readers. This is relatively easy to organize on a single machine, but it
requires a lot of synchronisation on a cluster. In addition there are lots of
metadata operations, around access control, timestamps that are technically
required, although access time is often disabled, and directory operations are
complex.&lt;/p&gt;

&lt;p&gt;The first well known alternative to the Posix filesystem was probably Amazon
S3, launched in 2006. This removed a lot of the problematic Posix constraints.
First there are no directories, although there is a prefix based listing that
can be used to give an impression of directories. Second, files can only be
updated atomically as a whole. This makes it essentially a key value store with
a listing function, and events. Later optional versioning was added too, so
previous versions of a value could be retrieved. Access control is a rather
complex extended version of per user ACLs, read, write and ability to make
changes. S3 is the most successful and largest distributed filesystem ever
created. People rarely complain that it is not Posix compatible; atomic file
update actually seems to capture most use cases. Perhaps the most common
complaint is inability to append, as people are not used to the model of
treating a set of files as a log rather than an individual appended file. There
are interfaces to treat S3 as a Posix-like filesystem, such as via Fuse,
although they rarely attempt to emulate full semantics and may do a lot of
copying behind the scenes, they can be convenient for some use cases where
users like to have a familiar interface.&lt;/p&gt;
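
&lt;p&gt;As a rough illustration of how a prefix listing over a flat key space can
give an impression of directories, here is a small Python sketch. It is loosely
modelled on delimiter based listing; the function name
&lt;code&gt;list_prefix&lt;/code&gt; and the in-memory dictionary are assumptions for the
example, not the S3 API.&lt;/p&gt;

```python
def list_prefix(store, prefix, delimiter="/"):
    """Return (keys, common_prefixes) found directly under prefix.

    The store is a flat mapping of key to value; nothing here is a real
    directory, the grouping is computed from the key names alone.
    """
    keys = []
    common = set()
    for key in sorted(store):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _tail = rest.partition(delimiter)
        if sep:
            # Something deeper exists: report it once as a pseudo directory.
            common.add(prefix + head + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common)


store = {
    "logs/2018/07/a": b"1",
    "logs/2018/07/b": b"2",
    "logs/readme": b"3",
    "data/x": b"4",
}
print(list_prefix(store, "logs/"))
# prints (['logs/readme'], ['logs/2018/'])
```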

&lt;p&gt;One of the reasons for the match between S3 and programs was that it was
designed around the HTTP verbs: &lt;code&gt;GET&lt;/code&gt;, &lt;code&gt;PUT&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt;. The HTTP resource
model, REST, was &lt;a href=&#34;https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf&#34;&gt;documented by
Fielding&lt;/a&gt;
in 2000, before S3, and &lt;code&gt;PATCH&lt;/code&gt;, which you can argue has slightly more Posixy
semantics, was only added in 2010. S3 is an HTTP service, and feels web native.
This was the same move that led to Ruby on Rails in 2005, and the growth of the
CRUD (create, read, update, delete) application as a design pattern, even if
that was originally typically database backed.&lt;/p&gt;

&lt;p&gt;So, we went from the Multics model to a key value model for scale out, while
keeping access control unchanged? Is this all you need to know about files?
Well actually no, there are two other important shifts, that remain less well
understood.&lt;/p&gt;

&lt;p&gt;The first of these is exemplified by Git, released in 2005, although the model
had been around before that. The core part of git is a content addressed store.
I quite like the term &amp;ldquo;value store&amp;rdquo; for these; essentially they are just like a
key value store only you do not get to choose the key, usually it is some sort
of hash of the content, SHA1 (for now) in Git, SHA256 for Docker images and
most modern versions. Often this is implemented by a key value store, but you
can optimise, as keys are always the same size and more uniformly distributed.
Listing is not a basic operation either. The user interface for content
addressed stores is much simpler, of the CRUD operations, only create and read
are meaningful. Delete is usually a garbage collection operation, and there is
no update. From the distributed system point of view there is no longer any
need to deal with update races, ETags and so on, removing a lot of complexity.
The content provides the key so there are no clashes. Many applications will
use a small key value store as well, such as for naming tags, but this is a very
small part of the overall system.&lt;/p&gt;
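
&lt;p&gt;A minimal sketch of such a value store in Python, assuming an in-memory
dictionary as the backing key value store and SHA256 as the content hash; the
class and method names are invented for this example. Only create and read
exist, and a reader can check the content against its key.&lt;/p&gt;

```python
import hashlib


class ValueStore:
    """In-memory content addressed store: the key is the SHA256 of the value."""

    def __init__(self):
        self._blobs = {}

    def put(self, content):
        """Store bytes and return the key, which the caller does not choose."""
        key = hashlib.sha256(content).hexdigest()
        # Writing the same content twice is harmless: same key, same value,
        # so there are no update races to resolve.
        self._blobs[key] = content
        return key

    def get(self, key):
        """Fetch bytes by key and verify they still hash to that key."""
        content = self._blobs[key]
        if hashlib.sha256(content).hexdigest() != key:
            raise ValueError("content does not match its key")
        return content


store = ValueStore()
key = store.put(b"hello")
print(store.get(key) == b"hello")  # prints True
```

&lt;p&gt;Because the key is derived from the content, anything fetched by key can be
cached indefinitely, and a mutable name such as a tag is just a small separate
mapping from the name to a key.&lt;/p&gt;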

&lt;p&gt;Sadly, content addressable systems are not as common as they should be. Docker
image layers were originally not content addressed, but this was switched for
security reasons a long time ago. There are libraries such as
&lt;a href=&#34;https://github.com/mirage/irmin&#34;&gt;Irmin&lt;/a&gt; that present a git model for data
structures,
but CRUD based models still dominate developer mindshare. This is despite the
advantages for things like caches, where content addressed data can have an
infinite cache lifetime, and the greater ease of distribution. It is now
possible to build a content addressed system on S3 based on SHA256 hashes, as
signed URLs, now that S3 supports an &lt;code&gt;x-amz-content-sha256&lt;/code&gt; header, see &lt;a href=&#34;https://gist.github.com/justincormack/496500a5bb1b0dd31f12f49b81ff931a&#34;&gt;this
gist for example
code&lt;/a&gt;.
The other cloud providers, and Minio, currently still only support MD5 based
content hashes, or CRC32c in the case of Google, that are of no use at all.
Hopefully they will update this to a modern useful content hash soon. I would
highly recommend looking at whether you can build the largest part of systems
on a content addressed store.&lt;/p&gt;

&lt;p&gt;The second big change is even less common so far, but it starts to follow on
from the first. Access control via ACL is complicated and easy to make mistakes
with. With content addressed storage, in situations where access control is not
uniform, such as private images on Docker hub, access control is complicated.
Ownership is also complicated as many people could have access to some pieces
of content. The effective solution here is to encrypt content and use key
management for access control. Encryption as an access control method has
upsides and downsides. It simplifies the common read path, as no access control
is needed on the read side at all. On the write side, with content addressing,
you just need a minimal level of access control to stop spam. On the downside
there is key management to deal with, and a possible performance hit. Note the
cloud providers provide server side encryption APIs, so they will encrypt the
contents of your S3 buckets with keys that they have access to and which you
can use IAM to delegate, but this is somewhat pointless, as you still have
exactly the same IAM access controls, and no end to end encryption; it is
mainly a checkbox for people who think it fixes regulatory requirements.&lt;/p&gt;

&lt;p&gt;So in summary, don&amp;rsquo;t use filesystems for large distributed systems, keep them
local to one machine. See if you can design your systems based on content
addressing, which scales best, and failing that use a key value store. User ACL
based access control is complicated to manage at scale, although cloud
providers like it as it gives them lock in. Consider encrypting data that needs
to be private as an access control model instead.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Making Immutable Infrastructure simpler with LinuxKit</title>
      <link>https://www.cloudatomiclab.com/immutable-gent/</link>
      <pubDate>Tue, 06 Feb 2018 20:40:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/immutable-gent/</guid>
      <description>

&lt;h2 id=&#34;config-management-camp&#34;&gt;Config Management Camp&lt;/h2&gt;

&lt;p&gt;I gave this talk at Config Management Camp 2018, in Gent. This is a great event, and I recommend you go if you are interested in systems and how to make them work.&lt;/p&gt;

&lt;p&gt;Did I mention that Gent is a lovely Belgian town?&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;./gent.jpg&#34; width=&#34;100%&#34;/&gt;&lt;/p&gt;

&lt;p&gt;The slides &lt;a href=&#34;./Making Immutable Infrastructure simpler with LinuxKit.pdf&#34;&gt;can be downloaded here&lt;/a&gt;.&lt;/p&gt;

&lt;iframe width=&#34;560&#34; height=&#34;315&#34; src=&#34;https://www.youtube.com/embed/DmcSo1Wts0Q?rel=0&#34; frameborder=&#34;0&#34; allow=&#34;autoplay; encrypted-media&#34; allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;Below is a brief summary.&lt;/p&gt;

&lt;h2 id=&#34;history&#34;&gt;History&lt;/h2&gt;

&lt;p&gt;Some history of the ideas behind immutability.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“The self-modifying behavior of both manual and automatic administration techniques helps explain the difficulty and expense of maintaining high availability and security in conventionally-administered infrastructures.
A concise and reliable way to describe any arbitrary state of a disk is to describe the procedure for creating that state.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Steve Traugott, Why Order Matters: Turing Equivalence in Automated Systems Administration, 2002&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“In the cloud, we know exactly what we want a server to be, and if we want to change that we simply terminate it and launch a new server with a new AMI.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Netflix Building with Legos, 2011&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“As a system administrator, one of the scariest things I ever encounter is a server that’s been running for ages. If you absolutely know a system has been created via automation and never changed since the moment of creation, most of the problems disappear.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Chad Fowler, Trash Your Servers and Burn Your Code, 2013&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Use container-specific OSes instead of general-purpose ones to reduce attack surfaces. When using a container-specific OS, attack surfaces are typically much smaller than they would be with a general-purpose OS, so there are fewer opportunities to attack and compromise a container-specific OS.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href=&#34;http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-190.pdf&#34;&gt;NIST Application Container Security Guide, 2017&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&#34;updates&#34;&gt;Updates&lt;/h2&gt;

&lt;p&gt;Updating software is a hard thing to do. Sometimes you can update a config file and send a &lt;code&gt;SIGHUP&lt;/code&gt;, other times you have to kill the process. Updating a library may mean restarting everything that depends on it. If you want to change the Docker config that means restarting all the containers potentially. Usually only Erlang programs self update correctly. Our tooling has got a domain specific view of how to do all this, but it is difficult. Usually there is some downtime on a single machine. But in a distributed system we always allow for single system downtime, so why be so hardcore about updates? Just restart the machine with a new image, that is the immutable infrastructure idea. Not immutable, just disposable.&lt;/p&gt;

&lt;h2 id=&#34;state&#34;&gt;State&lt;/h2&gt;

&lt;p&gt;Immutability does not mean there is no state. Twelve factor apps are not that interesting. Everything has data. But we have built Unix systems based on state being all kind of mixed up everywhere in the filesystem. We want to try to split between immutable code and mutable application state.&lt;/p&gt;

&lt;p&gt;Functional programming is a useful model. There is state in functional programs, but it is always explicit not implicit. Mutable global state is the thing that functional programming was a reaction against. Control and understand your state mutation.&lt;/p&gt;

&lt;p&gt;Immutability was something that we made people do for containers, well Docker did. LXC said treat containers like VMs, Docker said treat them as immutable. Docker had better usability and somehow we managed to get people to think they couldn&amp;rsquo;t update container state dynamically and to just redeploy. Sometimes people invent tooling to update containers, with Puppet or Chef or whatever, those people are weird.&lt;/p&gt;

&lt;p&gt;The hard problems are about distributed systems. Really hard. We can&amp;rsquo;t even know what the state is. These are the interesting configuration management problems. Focus on these. Make the individual machine as simple as possible, and just think about the distribution issues. Those are really hard. You don&amp;rsquo;t want configuration drift on machines messing up your system, there are plenty of ways to mess up distributed systems anyway.&lt;/p&gt;

&lt;h2 id=&#34;products&#34;&gt;Products&lt;/h2&gt;

&lt;p&gt;Why are there no immutable system products? Actually the sales model does not work well with something that is at build time only, not running on your infrastructure. The billing models for config management products don&amp;rsquo;t really work well. Immutable system tooling is likely to remain open source and community led for now. Cloud vendors may well be selling you products based on immutable infrastructure though.&lt;/p&gt;

&lt;h2 id=&#34;linuxkit&#34;&gt;LinuxKit&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://github.com/linuxkit/linuxkit&#34;&gt;LinuxKit&lt;/a&gt; was originally built for Docker for Mac. We needed a simple embedded, maintainable, invisible Linux host system to run Docker. The first commit message said &amp;ldquo;not required: self update: treated as immutable&amp;rdquo;. This project became LinuxKit, open sourced in 2017. The only kind of related tooling is Packer, but that is much more complex. One of the goals for LinuxKit was that you should be able to build an AWS AMI from your laptop without actually booting a machine. Essentially LinuxKit is a filesystem manipulation tool, based on containers.&lt;/p&gt;

&lt;p&gt;LinuxKit is based on a really simple model, the same as a Kubernetes pod. First a sequential series of containers runs, to set up the system state, then &lt;code&gt;containerd&lt;/code&gt; runs the main services. This config corresponds to the yaml config file, which itself is used to build the filesystem. Additional tooling lets you build any kind of disk format, for EFI or BIOS, such as ISOs, disk images or initramfs. There are development tools to run images on cloud providers and locally, but you can use any tooling, such as Terraform for production workloads.&lt;/p&gt;

&lt;h2 id=&#34;why-are-people-not-using-immutable-infrastructure&#34;&gt;Why are people not using immutable infrastructure?&lt;/h2&gt;

&lt;p&gt;Lack of tooling is one thing. Packer is really the only option other than LinuxKit, and it has a much more complex workflow involving booting a machine to install. This makes a CI pipeline much more complex. There are also nearly immutable distros like Container Linux, but this is very hard to customise compared to LinuxKit.&lt;/p&gt;

&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;

&lt;p&gt;This is a very brief summary of the talk. Please &lt;a href=&#34;https://github.com/linuxkit/linuxkit&#34;&gt;check out LinuxKit&lt;/a&gt;; it is an easy, different and fun way to use Linux.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Using the Noise Protocol Framework to design a distributed capability system</title>
      <link>https://www.cloudatomiclab.com/noise-capabilities/</link>
      <pubDate>Sun, 21 Jan 2018 23:00:00 +0000</pubDate>
      
      <guid>https://www.cloudatomiclab.com/noise-capabilities/</guid>
      <description>

&lt;p&gt;&lt;img src=&#34;./static.jpg&#34; width=&#34;100%&#34;/&gt;&lt;/p&gt;

&lt;h2 id=&#34;preamble&#34;&gt;Preamble&lt;/h2&gt;

&lt;p&gt;In order to understand this blog you should know about &lt;a href=&#34;https://en.wikipedia.org/wiki/Capability-based_security&#34;&gt;capability-based
security&lt;/a&gt;. Perhaps
still the best introduction, especially if you are mainly familiar with role
based access control, is the &lt;a href=&#34;http://srl.cs.jhu.edu/pubs/SRL2003-02.pdf&#34;&gt;Capability Myths
Demolished&lt;/a&gt; paper.&lt;/p&gt;

&lt;p&gt;You will also need to be familiar with the &lt;a href=&#34;http://noiseprotocol.org/noise.html&#34;&gt;Noise Protocol
Framework&lt;/a&gt;. Noise is a fairly new crypto
meta protocol, somewhat in the tradition of the NaCl Cryptobox: protocols you
can use easily without error. It is used in modern secure applications like
&lt;a href=&#34;https://www.wireguard.com/&#34;&gt;WireGuard&lt;/a&gt;. Before reading the specification, this
short (20m) talk from Real World Crypto 2018 by Trevor Perrin, the author, is
an excellent introduction.&lt;/p&gt;

&lt;iframe width=&#34;560&#34; height=&#34;315&#34; 
src=&#34;https://www.youtube.com/embed/3gipxdJ22iM?rel=0&#34; frameborder=&#34;0&#34; 
allow=&#34;autoplay; encrypted-media&#34; allowfullscreen&gt;&lt;/iframe&gt;

&lt;h2 id=&#34;simplicity&#34;&gt;Simplicity&lt;/h2&gt;

&lt;p&gt;Our stacks have become increasingly complicated. One of the things I have
been thinking about is protocols for lighter weight interactions. The smaller
services get, and the more we want high performance services, the worse the
overhead of protocols designed for large scale monoliths becomes. We
cannot replace larger scale systems with nanoservices, serverless and Edge
services if they cannot perform. In addition to performance, we need scalable
security and identity for nanoservices. Currently nanoservices and serverless
are not really competitive in performance with larger monolithic code, which
can serve millions of requests a second. Current serverless stacks hack around
this by persisting containers for minutes at a time to answer a single request.
Edge devices need simpler protocols too; you don&amp;rsquo;t really want gRPC in
microcontrollers. I will write more about this in future.&lt;/p&gt;

&lt;p&gt;Noise is a great framework for simple secure crypto. In particular, we need
understandable guarantees on the properties of the crypto between services. We
also need a workable identity model, which is where capabilities come in.&lt;/p&gt;

&lt;h2 id=&#34;capabilities&#34;&gt;Capabilities&lt;/h2&gt;

&lt;p&gt;Capability systems, and especially distributed capability systems, are not
terribly widely used at present. Early designs included KeyKOS, and the E
language, which has been taken up as the &lt;a href=&#34;https://capnproto.org/rpc.html&#34;&gt;Cap&amp;rsquo;n Proto
RPC&lt;/a&gt; design. Pony is also capability based,
although these are somewhat different &lt;a href=&#34;https://blog.acolyer.org/2016/02/17/deny-capabilities/&#34;&gt;deny
capabilities&lt;/a&gt;. Many
systems include some capability-like pieces though; Unix file descriptors are
capabilities for example, which is why file descriptor based Unix APIs are so
useful for security.&lt;/p&gt;

&lt;p&gt;With a large number of small services, we want to give out fine grained
capabilities. With dynamic services, this is much the most flexible way of
identifying and authorizing services. Capabilities are inherently
decentralised, with no CAs or other centralised infrastructure; services can
create and distribute capabilities independently, and decide on their trust
boundaries. Of course you can also use them just for systems you control and
trust too.&lt;/p&gt;

&lt;p&gt;While it has been recognised for quite a while that there is &lt;a href=&#34;http://www.cap-lore.com/CapTheory/Dist/PubKey.html&#34;&gt;an equivalence
between public key cryptography and
capabilities&lt;/a&gt;, this has not
been used much. I think part of the reason is that historically, public key
cryptography was slow, but of course computers are faster now, and encryption
is much more important.&lt;/p&gt;

&lt;p&gt;The correspondence works as follows. In order for Alice to send an encrypted
message to Bob, she must have his public key. Usually, people just publish
public keys so that anyone can send them messages, but if you do not
do this, things get more interesting. Possession of Bob&amp;rsquo;s public key
gives the capability of talking to Bob; without it you cannot construct an
encrypted message that Bob can decode. Actually it is more useful to think of
the keys in this case as belonging not to people but to roles or services.
Having the service public key allows connecting to it; having the private key
lets you serve it. Note you still need to find the service location to connect;
a hash of the public key could be a suitable DNS name.&lt;/p&gt;
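
&lt;p&gt;As a rough illustration of the last point, here is a minimal sketch of deriving a DNS name from a hash of the service public key. The choice of SHA-256 and base32, the zone &lt;code&gt;services.example&lt;/code&gt; and the function name &lt;code&gt;service_name&lt;/code&gt; are all assumptions made for the example, not part of any existing scheme.&lt;/p&gt;

```python
import base64
import hashlib

# Hypothetical parent zone, chosen only for this example.
SERVICE_ZONE = "services.example"


def service_name(public_key: bytes, zone: str = SERVICE_ZONE) -> str:
    """Derive a DNS name from the SHA-256 hash of a service public key."""
    digest = hashlib.sha256(public_key).digest()
    # Base32 uses only letters and the digits 2-7, so once the padding is
    # stripped and the text lowercased it is a valid DNS label. A 32-byte
    # digest gives 52 characters, inside the 63-character label limit.
    label = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return f"{label}.{zone}"


# Stand-in bytes where a real 32-byte public key would go.
example_key = bytes(range(32))
name = service_name(example_key)
```

&lt;p&gt;The name only helps locate the service. Knowing the hash does not give you the key, so it does not grant the capability.&lt;/p&gt;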

&lt;p&gt;On single hosts, capabilities are usually managed by a privileged process, such
as the operating system. This can give out secure references, such as small
integers like file descriptors, or object pointers protected by the type
system. These methods don&amp;rsquo;t really work in a distributed setup, and
capabilities need a representation on the wire. One of the concerns in the
literature is that if a (distributed) capability is just a string of
(unguessable) bits that can be distributed, then it might get distributed
maliciously. There are two aspects to this. First if a malicious agent has a
capability at all, it can use it maliciously, including proxying other
malicious users, if it has network access. So being able to pass the capability
on is no worse. Generally, only pass capabilities to trusted code, ideally code
that is confined by (lack of) capabilities in where it can communicate and does
not have access to other back channels. Don&amp;rsquo;t run untrusted code. In terms of
keys being exfiltrated unintentionally, this is also an issue that we generally
have with private keys; with capabilities all keys become things that,
especially in these times of Spectre, we have to be very careful with.
Mechanisms that avoid simply passing keys, and pass references instead, seem to
me to be more complicated and likely to have their own security issues.&lt;/p&gt;

&lt;h2 id=&#34;using-noise-to-build-a-capability-framework&#34;&gt;Using Noise to build a capability framework&lt;/h2&gt;

&lt;p&gt;The Noise spec says &amp;ldquo;The XX pattern is the most generically useful, since it
supports mutual authentication and transmission of static public keys.&amp;rdquo; However,
we will see that there are different options that make sense for our use case. The
XX pattern allows two parties who do not know each other to communicate and
exchange keys. The XX exchange still requires some sort of authentication, such
as certificates, to see if the two parties should trust each other.&lt;/p&gt;

&lt;p&gt;Note that &lt;a href=&#34;https://moderncrypto.org/mail-archive/noise/2018/001439.html&#34;&gt;Trevor Perrin pointed
out&lt;/a&gt; that just
using a public key is dangerous and using a pre-shared key (psk) in addition is
a better design. So you should use psk+public key as the capability. This
means that accidentally sharing the public key in a handshake is not a
disastrous event.&lt;/p&gt;
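
&lt;p&gt;One way to picture a psk plus public key capability is as a single token that bundles both values. The sketch below is only an assumption about how such a token might be packaged; the &lt;code&gt;Capability&lt;/code&gt; class and its encoding are hypothetical and are not defined by the Noise specification. The 32 byte length matches both Curve25519 public keys and Noise pre-shared keys.&lt;/p&gt;

```python
import base64
import secrets
from dataclasses import dataclass

KEY_LEN = 32  # length of a Curve25519 public key and of a Noise psk


@dataclass(frozen=True)
class Capability:
    """Hypothetical bundle of the two values needed to reach a service."""

    public_key: bytes  # static public key of the service
    psk: bytes         # pre-shared symmetric key

    def __post_init__(self):
        if len(self.public_key) != KEY_LEN or len(self.psk) != KEY_LEN:
            raise ValueError("public key and psk must both be 32 bytes")

    def encode(self) -> str:
        """Serialise to a URL-safe string that can be handed to a client."""
        raw = self.public_key + self.psk
        return base64.urlsafe_b64encode(raw).decode("ascii")

    @classmethod
    def decode(cls, token: str) -> "Capability":
        """Parse a token produced by encode()."""
        raw = base64.urlsafe_b64decode(token.encode("ascii"))
        if len(raw) != 2 * KEY_LEN:
            raise ValueError("malformed capability token")
        return cls(raw[:KEY_LEN], raw[KEY_LEN:])


# Random bytes stand in for a real public key in this example.
cap = Capability(secrets.token_bytes(KEY_LEN), secrets.token_bytes(KEY_LEN))
```

&lt;p&gt;With this shape, a public key that leaks during a handshake is only half of the token, so on its own it is not enough to connect.&lt;/p&gt;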

&lt;p&gt;When using keys as capabilities though we always know the public key (aka
capability) of the service we want to connect to. In &lt;a href=&#34;http://noiseprotocol.org/noise.html#interactive-patterns&#34;&gt;Noise
spec&lt;/a&gt; notation, that
is all the ones with &lt;code&gt;&amp;lt;- s&lt;/code&gt; in the pre-message pattern. This indicates that
prior to the start of the handshake phase, the responder (service) has sent
their public key to the initiator (directly or indirectly). That is, the
initiator of the communication possesses the capability required to connect to
the service, in capability speak. So these patterns are the ones that
correspond to capability systems; for the interactive patterns that is &lt;code&gt;NK&lt;/code&gt;,
&lt;code&gt;KK&lt;/code&gt; and &lt;code&gt;XK&lt;/code&gt;.&lt;/p&gt;
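
&lt;p&gt;The selection rule can be sketched in a few lines. In Noise pattern names the second letter describes the responder static key, and &lt;code&gt;K&lt;/code&gt; means the initiator knows it before the handshake. The helper name &lt;code&gt;responder_key_known&lt;/code&gt; is made up for this example. Note that the rule also picks up &lt;code&gt;IK&lt;/code&gt;, the shorter variant of &lt;code&gt;XK&lt;/code&gt;.&lt;/p&gt;

```python
# The twelve fundamental interactive patterns listed in the Noise spec.
INTERACTIVE_PATTERNS = [
    "NN", "NK", "NX",
    "XN", "XK", "XX",
    "KN", "KK", "KX",
    "IN", "IK", "IX",
]


def responder_key_known(pattern: str) -> bool:
    """True if the initiator holds the responder static key in advance."""
    # Second letter: N = no static key, K = known in advance,
    # X = transmitted during the handshake.
    return pattern[1] == "K"


capability_patterns = [p for p in INTERACTIVE_PATTERNS if responder_key_known(p)]
# capability_patterns == ["NK", "XK", "KK", "IK"]
```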

&lt;p&gt;&lt;code&gt;NK&lt;/code&gt; corresponds to a communication where the initiator does not provide any
identification. This is the normal situation for many capability systems; once
you have the capability you perform an action. If the capability is public, or
widely distributed, this corresponds more or less to a public web API, although
with encryption.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;XK&lt;/code&gt; and &lt;code&gt;IK&lt;/code&gt; are the equivalent in web terms of providing a (validated) login along with
the connection. The initiator passes a public key (which could be a capability,
or just used as a key) during the handshake. If you want to store some data
which is attached to the identity you use as the passed public key, this
handshake makes sense. Note that the initiator can create any number of public
keys, so the key is not a unique identifier, just one chosen identity. &lt;code&gt;IK&lt;/code&gt; has
the same semantics but uses a different, shorter handshake with slightly different
security properties; it is the one used by WireGuard.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;KK&lt;/code&gt; is an unusual handshake in traditional capability terms; it requires that
both parties know in advance each other&amp;rsquo;s public key, i.e. that there is in a
sense a mutual capability arrangement, or rendezvous. You could just connect
with &lt;code&gt;XK&lt;/code&gt; and then check the key, but having this in the handshake may make
sense. An &lt;code&gt;XK&lt;/code&gt; handshake could be a precursor to a &lt;code&gt;KK&lt;/code&gt; relationship in future.&lt;/p&gt;

&lt;p&gt;In addition to the more common two way handshakes, Noise supports
unidirectional one way messages. At present it is not common to use public key
encryption for offline messages, such as encrypting a file or database record.
Usually symmetric keys are used. The Noise one way methods use public keys, and
all three &lt;code&gt;N&lt;/code&gt;, &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;K&lt;/code&gt; require the recipient public key (otherwise they
would not be confidential), so they all correspond to capabilities based
exchanges. Just like the interactive patterns, they can be anonymous or pass or
require keys. These patterns have a disadvantage: there is no replay
protection, as the receiver cannot provide an ephemeral key. They are still a
good fit for offline uses, such as store and forward, or file or database
encryption. Unlike
symmetric keys for this use case, there is a separate sender and receiver role,
so the ability to read database records does not mean the ability to forge
them, improving security. It also fits much better in a capabilities world, and
is simpler as there is only one type of key, rather than having two types and
complex key management.&lt;/p&gt;
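
&lt;p&gt;Putting the interactive and one way cases together, the choice of pattern comes down to how the initiator identifies itself. The following is a small sketch of that decision; the names &lt;code&gt;InitiatorIdentity&lt;/code&gt; and &lt;code&gt;choose_pattern&lt;/code&gt; are hypothetical, and &lt;code&gt;IK&lt;/code&gt; is left out because it differs from &lt;code&gt;XK&lt;/code&gt; only in handshake length and security properties.&lt;/p&gt;

```python
from enum import Enum


class InitiatorIdentity(Enum):
    """How the initiator (or sender) presents its own static key."""

    ANONYMOUS = "N"    # no static key at all
    TRANSMITTED = "X"  # static key sent during the exchange
    KNOWN = "K"        # static key already known to the other side


def choose_pattern(identity: InitiatorIdentity, interactive: bool) -> str:
    """Return the name of a capability-style Noise pattern."""
    if interactive:
        # Interactive names carry a second letter for the responder key;
        # K means the initiator already holds it, i.e. has the capability.
        return identity.value + "K"
    # One way patterns always need the recipient key, so the name has
    # only the letter describing the sender.
    return identity.value


anonymous_call = choose_pattern(InitiatorIdentity.ANONYMOUS, interactive=True)
stored_record = choose_pattern(InitiatorIdentity.KNOWN, interactive=False)
```

&lt;p&gt;Here &lt;code&gt;anonymous_call&lt;/code&gt; is &lt;code&gt;NK&lt;/code&gt; and &lt;code&gt;stored_record&lt;/code&gt; is the one way &lt;code&gt;K&lt;/code&gt; pattern.&lt;/p&gt;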

&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;

&lt;p&gt;I don&amp;rsquo;t have an implementation right now. I was working on a prototype
previously, but using Noise &lt;code&gt;XX&lt;/code&gt; and still trying to work out what to do for
authentication. I much prefer this design, which answers those questions. There
are a bunch of practicalities that are needed to make this usable, and some
conventions and guidelines around key usage.&lt;/p&gt;

&lt;p&gt;We can see that we can use the Noise Protocol as a set of three interactive and
three one way patterns for capability based exchanges. No additional
certificates or central source of authorisation is needed other than public and
private key pairs for Diffie-Hellman exchanges. Public keys can be used as
capabilities; private keys give the ability to provide services. The system is
decentralised, encrypted and simple. And there are interesting properties of
mutual capabilities that can be used if required.&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>