WebRTC is designed for peer-to-peer connections, which means data such as video or audio streams can flow directly between two peers instead of via a server. Setting up such a peer-to-peer connection is not inherently difficult, but the standard allows for many options to be configured.
WebRTC connections require a fair amount of configuration. When confronted with the connection setup for the first time, it is easy to get lost in the details, which is why I want to provide a high-level overview of the parts involved in a WebRTC peer-to-peer connection and how these parts interact.
Setting up a WebRTC peer-to-peer connection can roughly be divided into four steps:
The figure here schematically depicts these four steps.
Before we can talk about creating a peer-to-peer connection, we first need to understand the nature and the role of the Signalling channel in the connection setup.
The Signalling channel is an integral part of the procedure to set up a WebRTC peer-to-peer connection. Via the Signalling channel, peers can send SDP offers, SDP answers and ICE candidates to each other.
It is important to understand that - while essential for the connection setup - the signalling channel is not part of WebRTC itself. Any connection between two peers can be used for signalling; theoretically, nothing is stopping you from using email to set up your WebRTC connection. Typically, signalling is done via a signalling server.
First, the “calling” peer A needs to create a Session Description Protocol (SDP) offer and send it via the Signalling channel to the other peer B. SDP is a format, defined in RFC 4566, for describing multimedia sessions. SDP offers and answers have the same structure and contain information such as the media types, media formats, or transport protocols to be used.
Secondly, once the “receiving” peer B has received an SDP offer, an SDP Answer has to be created and returned to A via the Signalling channel.
Then, both peers request Interactive Connectivity Establishment (ICE) candidates from one or more Session Traversal Utilities for NAT (STUN) servers. An ICE candidate contains a description of a public-facing connection endpoint of a peer. Such a description can later be used for creating the peer-to-peer connection. The process of receiving ICE candidates is called “gathering”, and it typically takes a while until it is completed.
There are many publicly available STUN servers (e.g. here) that can be used for your WebRTC projects. If a peer is not reachable through its NAT, “Traversal Using Relays around NAT” (TURN) servers can be used as a fallback. TURN servers relay packets between peers and theoretically should always work. However, such servers have high bandwidth requirements as all traffic is routed through them, so they can easily become a bottleneck for a WebRTC application with many clients.
Finally, the peers send their collected ICE candidates to each other via the Signalling channel. Typically, instead of waiting until gathering is completed, peers send each ICE candidate as soon as they receive it, until a connection can be established. So steps 3 and 4 are not strictly performed in order, but typically happen in parallel and are repeated.
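In browser JavaScript, the calling side of this flow can be sketched roughly as follows. The `signaling` object is a hypothetical wrapper around whatever signalling channel you use (e.g. a WebSocket), with a `send` method and an `onmessage` callback; `RTCPeerConnection` is a browser API, so this sketch only runs in a browser.

```javascript
// Sketch of the "calling" peer A. `signaling` is a hypothetical wrapper
// around your own signalling channel with send() and an onmessage callback.
async function startCall(signaling) {
  const pc = new RTCPeerConnection({
    // A public STUN server used for ICE candidate gathering.
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  });

  // Step 4: send each ICE candidate as soon as it has been gathered.
  pc.onicecandidate = ({ candidate }) => {
    if (candidate) signaling.send({ candidate });
  };

  // Step 1: create an SDP offer and send it via the signalling channel.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send({ sdp: pc.localDescription });

  // Step 2 arrives from the other peer: apply the SDP answer, and apply
  // any ICE candidates the other peer sends us (step 4 from its side).
  signaling.onmessage = async (msg) => {
    if (msg.sdp) await pc.setRemoteDescription(msg.sdp);
    else if (msg.candidate) await pc.addIceCandidate(msg.candidate);
  };

  return pc;
}
```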
Setting up a WebRTC peer-to-peer connection consists of four high-level steps - creating and sending an SDP offer, creating and sending an SDP answer, gathering ICE candidates from a STUN/TURN server, and sending those candidates to the other peer. Offers, answers, and ICE candidates are sent and received via a Signalling channel, which is an arbitrary communication channel between the two peers.
TensorFlow offers mechanisms to learn deep learning models from sequences of varying lengths. For example, LSTMs – one of the most prominent deep models for sequential data – are implemented in TensorFlow using a tf.while_loop, which dynamically supports sequences of arbitrary length. Dynamically in this context means that the sequence lengths do not need to be known beforehand.
TensorFlow’s while mechanism is very flexible, as it neither needs to know the length of the sequences nor the size of the mini-batches beforehand. However, one downside of this mechanism is that only sequences of similar lengths should be batched together to keep the learning efficient.
To enable efficient learning for tasks where elements have varying lengths, TensorFlow (experimentally) provides a method that groups sequences into certain buckets by length and pads them. So for example, if you want to have two buckets of lengths 5 and 10, each sequence in your data set will be associated with one of the buckets and padded until it has length 5 or 10.
Padding is useful for sequences that have a beginning and an end; however, it does not make sense for sequences of unbounded length, where the padding target is unknown. In such a scenario, truncating sequences is more useful.
In this post, I describe how to implement a TensorFlow method that truncates sequences and groups them by lengths so that models can be learned more efficiently. A gist with the complete sample code can be found here.
Our goal is to create a TensorFlow method that puts sequences into “buckets” of a certain length, and truncates them to the minimum length of a bucket. A bucket is defined by a minimum and maximum sequence length. For example, we could define two buckets, 5-9 and 10-14. A sequence of length 6 would be put into the first bucket and truncated to 5, and a sequence of length 13 put into the second bucket and truncated to length 10.
TensorFlow experimentally supports grouping datasets using the group_by_window function, which internally follows a map-reduce scheme to group a dataset by a given criterion. This means that for our goal we need two functions: a map function that tells us into which bucket a sequence goes, and a reduce function that tells us how the batches for each bucket should be created. The reduce function is also where we truncate the sequences.
Additionally, we either need a function that tells us the batch size for each bucket, or we specify one batch size for all buckets. Specifying the batch size per bucket may help us to use all the memory of our GPUs, as we can specify larger batches for smaller sequence lengths and smaller batches for larger sequence lengths.
For the implementation, we follow the structure of our solution sketch. We first describe the map function, which maps a sequence to a bucket by length. We then describe the reduce function, which truncates and batches the sequences.
The goal of our map function is to map sequences to buckets given the sequence length. So for example, given two buckets, bucket 1 covering lengths 10-14 and bucket 2 covering lengths 15-19, a sequence with length 11 is mapped to bucket 1 and a sequence with length 18 to bucket 2.
In Python, such a method could be easily implemented with a few lines of code, for example:
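In plain Python, with each bucket given as a half-open (lower, upper) interval, such a mapping might look like this (the bucket boundaries are made up for illustration):

```python
# Hypothetical bucket boundaries: bucket 0 holds lengths 5-9, bucket 1 holds 10-14.
BUCKETS = [(5, 10), (10, 15)]  # half-open intervals [lower, upper)

def bucket_id(sequence):
    """Return the index of the bucket a sequence falls into, or None."""
    for i, (lower, upper) in enumerate(BUCKETS):
        if lower <= len(sequence) < upper:
            return i
    return None  # the sequence fits no bucket
```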
In TensorFlow, the base units for calculations are Tensors and a computation graph, which offer the advantage of performing calculations efficiently in batches on a GPU; however, this requires a bit of rethinking of the implementation. One way to implement that logic is by using TensorFlow’s where function, which tells us at which positions a certain condition is true. To that end, we need to define two arrays with the bucket boundaries, one containing the lower bounds and one containing the upper bounds, and then use TensorFlow’s logical conditions to define the condition. Finally, we can remove the extra dimension that was introduced by querying the arrays using reduce_min (or any other function that provides similar functionality).
An implementation of the described logic could look like this:
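A sketch of that map function, assuming eager-mode TensorFlow 2 and two hypothetical buckets covering lengths [5, 10) and [10, 15):

```python
import tensorflow as tf

# Hypothetical bucket boundaries: lengths [5, 10) map to bucket 0, [10, 15) to bucket 1.
lower_bounds = tf.constant([5, 10], dtype=tf.int64)
upper_bounds = tf.constant([10, 15], dtype=tf.int64)

def element_to_bucket_id(seq):
    seq_len = tf.size(seq, out_type=tf.int64)
    # Boolean vector that is True at the position of the matching bucket.
    in_bucket = tf.logical_and(tf.less_equal(lower_bounds, seq_len),
                               tf.less(seq_len, upper_bounds))
    # tf.where returns indices with an extra dimension; reduce_min removes it.
    return tf.reduce_min(tf.where(in_bucket))
```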
The reduce function takes care of creating the batches and truncating the sequences. The group_by_window function passes a dataset (following TensorFlow’s Dataset API convention) for each bucket to it. So in this function, we need to truncate the sequences and create the batches.
One way to truncate the sequences of the dataset is by using TensorFlow’s slicing, which can simply be applied to the dataset using a lambda function.
Batching can be achieved using the batch method. An implementation of the reduce function could look like this:
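Under the same assumptions (hypothetical buckets with lower bounds 5 and 10), a reduce function that truncates each sequence to its bucket's minimum length and batches the result might look like this:

```python
import tensorflow as tf

# Hypothetical lower bucket bounds; sequences are truncated to these lengths.
bucket_min_lengths = tf.constant([5, 10], dtype=tf.int64)

def reduce_func(bucket_id, bucket_dataset):
    min_len = bucket_min_lengths[bucket_id]
    # Truncate every sequence in the bucket via slicing, then batch.
    truncated = bucket_dataset.map(lambda seq: seq[:min_len])
    return truncated.batch(32)
```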
Defining the batch size using a function allows us to dynamically change the batch size based on the bucket definition, potentially making your code perform better, as the memory of your GPU can be maxed out.
One of the simplest forms of such a function takes the bucket id and maps it to a batch size.
TensorFlow’s Dataset API requires us to define a function that can be applied to a dataset to perform a transformation, rather than the result of a transformation. Such a function takes only a dataset and performs the custom transformation when called. Here, we only apply the group_by_window function, which takes the map and reduce functions as parameters. An implementation of such an application function could look like this:
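Putting everything together, a self-contained sketch (with made-up buckets and batch sizes) could look like this:

```python
import tensorflow as tf

# Hypothetical buckets: lengths [5, 10) -> bucket 0, [10, 15) -> bucket 1.
lower = tf.constant([5, 10], dtype=tf.int64)
upper = tf.constant([10, 15], dtype=tf.int64)

def key_func(seq):
    # Map function: which bucket does this sequence belong to?
    seq_len = tf.size(seq, out_type=tf.int64)
    in_bucket = tf.logical_and(tf.less_equal(lower, seq_len),
                               tf.less(seq_len, upper))
    return tf.reduce_min(tf.where(in_bucket))

def reduce_func(bucket_id, bucket_ds):
    # Reduce function: truncate to the bucket's lower bound, then batch.
    min_len = lower[bucket_id]
    return bucket_ds.map(lambda seq: seq[:min_len]).batch(2)

def window_size_func(bucket_id):
    # Per-bucket window size (made-up values; larger windows for shorter sequences).
    return tf.constant([4, 2], dtype=tf.int64)[bucket_id]

def bucket_and_truncate(dataset):
    # The transformation function that can be passed to Dataset.apply.
    return dataset.apply(tf.data.experimental.group_by_window(
        key_func=key_func,
        reduce_func=reduce_func,
        window_size_func=window_size_func))
```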
In this blog post, I described how to implement a dataset transformation that groups sequences into predefined buckets by length and truncates them to the minimum length of their bucket.
A gist with the complete sample code can be found here.
One thing that bugged me is that typically this file is just a plain text file lying around on your computer, free to be read. Thus, a while ago I decided to implement a storage layer with password protection that encrypts the file. This blog post introduces the library (sources), which has been released under the MIT license.
The implementation of the encrypted JSON storage is pretty straightforward, as the TinyDB storage API only requires you to implement four methods: init, read, write, and close. Each of these four methods does what its name indicates - init sets up the storage, read loads the data from the file into memory, write dumps the data from memory to the file, and close tears down the storage layer, e.g. closes file pointers and such. In addition to that, I added a method for changing the password.
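To illustrate this contract, here is a minimal unencrypted storage with the same four methods. This is a simplified sketch of the storage API, not the actual plugin code (a real TinyDB storage additionally derives from tinydb.storages.Storage):

```python
import json
import os

class MinimalJSONStorage:
    """Sketch of TinyDB's four-method storage contract (without encryption)."""

    def __init__(self, path):
        # init: set up the storage, here just remembering the file path.
        self._path = path

    def read(self):
        # read: load the data from the file; None signals an empty database.
        if not os.path.exists(self._path):
            return None
        with open(self._path) as f:
            return json.load(f)

    def write(self, data):
        # write: dump the in-memory data to the file.
        with open(self._path, "w") as f:
            json.dump(data, f)

    def close(self):
        # close: nothing to tear down here, since files are opened per call.
        pass
```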
Init Initializing the encrypted storage takes care of the setup. I use AES encryption with 256-bit keys, cipher block chaining (CBC), and a random initialization vector (IV). The IV is stored at the beginning of the encrypted database file, alongside the length of the file. The IV is loaded from the database file if it exists, and created otherwise. To ensure the key has a length of 256 bits, I hash it using SHA-256, which is, at the moment of writing, a NIST-approved secure hash algorithm.
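The key derivation step can be sketched in a few lines (the function name is mine, not the plugin's):

```python
import hashlib

def derive_key(password: str) -> bytes:
    # SHA-256 maps an arbitrary-length password to exactly 256 bits (32 bytes),
    # which is the key size AES-256 expects.
    return hashlib.sha256(password.encode("utf-8")).digest()
```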
Read Reading the encrypted data is pretty straightforward. TinyDB requires the read function to pass on all the database’s data, so to implement the read functionality you need to decrypt the database, read it, and pass it on.
Write Writing the encrypted storage is a bit more problematic than reading, because if anything happens during the write process (e.g. an exception, or a power outage), the whole database would be corrupted. With that in mind, I’ve designed the write method such that it creates a backup of the encrypted storage before it starts to write, which can be used for recovery in case of unforeseen events.
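The general backup-before-write pattern described here might be sketched like this (a simplification; the plugin's actual implementation may differ):

```python
import os
import shutil

def write_with_backup(path, write_func):
    """Back up `path`, run `write_func(path)`, restore the backup on failure."""
    backup = path + ".bak"
    if os.path.exists(path):
        shutil.copy2(path, backup)  # snapshot before touching the file
    try:
        write_func(path)
    except Exception:
        if os.path.exists(backup):
            shutil.move(backup, path)  # recover the previous state
        raise
    else:
        if os.path.exists(backup):
            os.remove(backup)  # success: the backup is no longer needed
```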
Close Closing the encrypted database only requires to close the file handle.
Changing the password Changing the password is – similarly to writing the data – a bit problematic in the case when something unexpected happens. That is why I first create a new database and copy all the data from the old database into it, and once that has successfully finished, I replace the old database with the new one. Finally, I reinitialize the database with the new encryption key.
The plugin is built on top of PyCryptodome and TinyDB, which require Python 3.4 and Python 3.5, respectively, which means that you need Python 3.5 or later to use this plugin.
Installation You can install the plugin using the python package manager pip, simply by executing the following command or adding it to your requirements file.
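The command might look like the following; note that the package name here is an assumption, so check the project's README for the actual name:

```shell
# Assumed package name -- verify against the project's README.
pip install tinydb-encrypted-jsonstorage
```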
Creating a database To use the encrypted JSON storage, you need to set the storage argument of the TinyDB constructor to EncryptedJSONStorage and specify an encryption key.
Changing the encryption key For changing the encryption key, you simply need to access the storage property of your database and call the change_encryption_key method.
TinyDB is a Python database that allows you to add hassle-free persistence to small projects. Here, I introduced a storage plugin for TinyDB that allows you to store your data as encrypted JSON. It can be installed via pip, and using it requires only a few lines of code.
Vue pre-processors take .vue files and compile them to JavaScript, HTML, and CSS. Jekyll, in contrast, takes kramdown files and HTML templates and renders them to JavaScript, HTML, and CSS. Obviously, neither of the two can take the output of the other as input, so you cannot simply chain them.
Another problem is that if you build Vue files for a production environment, the JavaScript and CSS files get minified, which means you cannot easily access them from a script tag in your Jekyll posts, as the names of your objects and functions get changed.
I can think of three workarounds that allow you to integrate Vue apps to your Jekyll pages:
The first solution is IMO the most elegant, but it is also the one that requires the most work. The second option is easier to implement, but offers less flexibility for building apps directly within blog posts, because the app is developed outside of the blog post. The third option allows you to develop structured Vue apps within a blog post, but requires you to split your Vue app into multiple component libraries and compile each one of them separately.
My main motivation is to create Vue apps composed of multiple components directly in a blog post, so I’ll stick with the third option here.
To export your components as libraries, you need to do two things: add an entry point for your library, and then compile them as a library. To use them in a Jekyll post, you need to include the JavaScript and CSS files in your blog post via the corresponding script and link tags.
Add an entry point to your component. The purpose of the entry point is to allow you to define the exports of your component and to register your component in Vue. So to create a component library, create a folder, move your component(s) into this folder, and create an index.js. An example index.js could look like this:
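A possible index.js, assuming Vue 2 and a single component file MyComponent.vue next to it (file and component names are placeholders):

```javascript
// index.js of the component library (hypothetical names).
import Vue from "vue";
import MyComponent from "./MyComponent.vue";

const components = { MyComponent };

// Register each component under a fixed name, so it can still be
// referenced after the minifier has mangled identifiers.
Object.entries(components).forEach(([name, component]) => {
  Vue.component(name, component);
});

export default components;
```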
In this brief example, I register and export one component: MyComponent.vue. If I wanted to add more components, I would need to import them and add them to the components object. The purpose of the registration is to allow you to access your component after the JavaScript gets minified.
Compile the component as a library. To compile the component as a library, you need to invoke the vue-cli-service build command with the “--target lib” parameter. The most convenient way to do so is to add a script to your package.json. This build script should point to the index.js of your component library. An example could look like this:
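For example, a package.json fragment like this (paths and names are placeholders):

```json
{
  "scripts": {
    "build-lib-my-component": "vue-cli-service build --target lib --name my-component src/components/my-component/index.js"
  }
}
```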
This example compiles the Vue component and creates the my-component .js and .css files in the output folder. The output folder is typically specified in the vue.config.js file.
Include stylesheets. Next, one needs to include the component's stylesheet files in the blog post. The stylesheet can be included in the body tag. While it is considered best practice to include stylesheets in the head element, link elements that contain stylesheets are actually body-ok, so the page will be loaded correctly.
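Assuming the compiled files are served from your blog's assets folder (the path is an assumption), the include could look like this:

```html
<link rel="stylesheet" href="/assets/vue/my-component.css">
```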
Define the Vue app. Next, to use your component in a blog post, you need to define the Vue app there. As described in my previous post, the container element needs to be encapsulated in raw/endraw tags, but other than that it is straightforward.
Finally, you need to load the Vue app, which follows a fairly standard pattern. The only thing you have to keep in mind is to set the type attribute of the script tag that loads the app to module.
Multiple component libraries. To add multiple component libraries to your project, you have to repeat the previous steps for each of the libraries. Compiling multiple libraries by hand, one after the other, can get cumbersome. To build multiple libraries at once, you can use the npm-run-all package. It allows you to define build steps that run other build steps.
For example, suppose all the build steps for your libraries are prefixed with “build-lib-”. Then, using the npm-run-all package, you could add another build script that invokes all other build scripts referenced by a glob-like naming. So in our example, “build-lib-*” would invoke all build steps that start with “build-lib-”.
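A package.json fragment for this setup might look like the following (library names are placeholders):

```json
{
  "scripts": {
    "build-lib-my-component": "vue-cli-service build --target lib --name my-component src/components/my-component/index.js",
    "build-lib-other": "vue-cli-service build --target lib --name other src/components/other/index.js",
    "build-libs": "npm-run-all build-lib-*"
  }
}
```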
Vue allows us to define reusable components, but they need to be pre-processed to be included in a web-page. This pre-processing poses a problem when combined with other content rendering engines such as Jekyll, which also needs to process files to create something displayable.
In this post I’ve outlined a strategy that allows you to use Vue component libraries directly in Jekyll posts. To do so, you need to define and compile the Vue components as component libraries and then load them from your blog post.
The described strategy allows you to develop your Vue app directly in your blog posts and to reuse your Vue components. The downside of this strategy is some configuration overhead, and that Vue apps get more difficult to debug.
In this blog post, I’ll briefly explore how to integrate Vue.js components in your Jekyll blog. The marriage of these two frameworks allows you to reuse your blog’s components in some other web app, or vice versa, to reuse other components in your blog. So, let’s get our hands dirty.
I tried this for Jekyll 4.0.0 with kramdown as markdown renderer, using ruby 2.6.3 and Vue.js 2.x.
The easiest way to integrate a Vue app to an existing, static HTML site requires only three simple steps: add the Vue library to the site, add a DOM element for the Vue app to live in and finally to define the Vue app. Integration with Jekyll is just a little more complicated than that.
Our first step is to include the Vue library in our site. To do so, we simply have to add a <script> tag to the layout file that we use for our page.
Vue offers two different versions of its library, development and production. The development version is optimized for debugging and understandability, whereas the production version is optimized for speed. Since I don’t want to change the link all the time, I will make the version of the Vue library dependent on the Jekyll environment.
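Using Jekyll's jekyll.environment Liquid variable, the include in the layout file could look like this (CDN URLs shown for Vue 2; in practice you may want to pin an exact version):

```html
{% if jekyll.environment == "production" %}
  <script src="https://cdn.jsdelivr.net/npm/vue@2/dist/vue.min.js"></script>
{% else %}
  <script src="https://cdn.jsdelivr.net/npm/vue@2/dist/vue.js"></script>
{% endif %}
```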
Then, we need to define the DOM element where the Vue app can live. Typically, that is a <div> with the id ‘app’, but it can be any valid DOM element with any valid selector. By default, kramdown interprets curly braces, so the DOM element in kramdown needs to be preceded by a raw directive and succeeded by an endraw directive, which stops kramdown from interpreting the content.
Please note that in the code sample of the original post there is an extra space between the opening curly brace and the percentage sign, which needs to be removed in your Jekyll post; it should look like this: {% endraw %}. Since this is itself a Jekyll page, I cannot use the directive directly, because it would close the previous raw directive.
Finally, we need to define our Vue.js app in our page. We’ll stick to the hello world example. Kramdown passes inline HTML through, so adding JavaScript works like in a regular HTML page: simply put it between <script> </script> tags.
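A minimal hello world app inside a post could then look like this (in an actual kramdown post, the div must be wrapped in the raw/endraw directives as described above):

```html
{% raw %}
<div id="app">{{ message }}</div>
{% endraw %}

<script>
  new Vue({
    el: "#app",
    data: { message: "Hello Jekyll" },
  });
</script>
```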
That’s it. Your post should now display “Hello Jekyll”.
One of the main strengths of Vue.js is that it allows creating component-based web apps. Such components can be viewed as building blocks of web apps, which is pretty neat. Adding components to a Jekyll post is pretty straightforward. The only thing you have to be aware of is that curly braces need to be escaped. So if you define your component within a script tag, you need to add surrounding raw and endraw directives.
The component definition needs to be included before adding the app definition, but after including the Vue.js sources. Alternatively, you can also include your component in a .js file, for example like this:
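Such a component definition file, loaded via a script tag before the app definition, might look like this (the component name and template are made up):

```javascript
// my-component.js -- hypothetical globally registered component (Vue 2 API).
// Assumes the Vue library has already been included on the page.
Vue.component("greeting-card", {
  props: ["name"],
  template: "<p>Hello, {{ name }}!</p>",
});
```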
Integrating Vue.js as described before allows you to quickly add some nice Vue-powered apps to your blog posts. This is nice for small additions to your page, but as soon as your app becomes more complex, single file components may be better suited.
Single file components allow you to nicely encapsulate your Vue components in a file. Typically, you need to run a Vue pre-processor to translate such single file components into plain JavaScript, HTML and CSS which can be read by a browser.
Jekyll is also a pre-processor, which takes kramdown and template files to create HTML, JavaScript, and CSS that browsers can read. Running two pre-processors that are not built for each other poses some obvious problems.
There is a simpler, yet hackier way to include single file components, namely by using the http-vue-loader library. To do so, all you need to do is include the http-vue-loader library before instantiating the Vue app and include your components via the httpVueLoader method.
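With http-vue-loader, the integration might look like this (the component path is an assumption):

```html
<script src="https://unpkg.com/http-vue-loader"></script>
<script>
  new Vue({
    el: "#app",
    components: {
      // Fetches and compiles the .vue file directly in the browser.
      "my-component": httpVueLoader("/assets/vue/MyComponent.vue"),
    },
  });
</script>
```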