Model vs. Algorithm

As discussion about Artificial Intelligence has become a mainstream topic, we hear a lot of terms used loosely. One key example is model versus algorithm, with everything typically being called an algorithm (because it sounds more scientific?). In fact, when we are talking about AI in applications (in society) we are almost always talking about models, not about algorithms. A model is a stylized way to describe a relation between two things. Just like a London supermodel is a stylized way to show clothes.

Using Days of the Week to Understand Modulo

Today is Monday, January 1st, 2024; what day of the week will it be in 7 days? That is pretty simple: a week has seven days, so seven days from now exactly one week will have passed, and it will again be a Monday. That's correct. Now let's make this slightly harder: what day of the week will it be in 14 days? That is also pretty simple: a week has seven days, 2 times 7 is 14, so in 14 days exactly 2 weeks will have passed, and it will again be a Monday.
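Since 7 and 14 are multiples of 7, only the remainder after dividing by 7 determines the day; this is exactly the modulo operation. A minimal sketch in R (my own illustration, not code from the post):

{% highlight r %}
# day-of-week arithmetic is modulo-7 arithmetic
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")

# day of the week n days after a Monday (%% is R's modulo operator)
day_in <- function(n) days[(n %% 7) + 1]

day_in(7)   # "Monday"
day_in(14)  # "Monday"
day_in(10)  # "Thursday"
{% endhighlight %}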

Computing on Encrypted Data - What? Why? Toy example

Encryption is typically used to protect information such as written communication from eavesdropping, for example messages sent over the internet. However, encryption can also be used to protect data, and with homomorphic encryption the data can even be computed upon while encrypted: the results remain encrypted (but valid), and can be decrypted again by the original data owner/supplier. A couple of years ago I was skiing on Mont Blanc, I had a bad fall and an X-ray was made. This being a mountain village (Chamonix), there was no medical doctor available, so a veterinarian (there are lots of cows in those mountains) took it. A veterinarian may be able to operate the X-ray machine, but they cannot interpret the photograph (all true up to here).

CKKS encode encrypt in R

This blog post shows how to perform CKKS encoding and encryption, followed by decryption and decoding, to obtain the original vector of complex numbers. The code uses the R package polynom for polynomials. It also uses the HomomorphicEncryption package. If you are reading this before 30 December 2023, you need to install the development version of HomomorphicEncryption from GitHub. If you are reading it on or after 30 December 2023, you can install it from CRAN (make sure it is up to date).

libactivation on PyPI

My new Python package libactivation is now on the Python Package Index (PyPI): https://pypi.org/project/libactivation/ The package implements a series of activation functions - sigmoidal and others, i.a. the Rectified Linear Unit (ReLU) - as well as their derivatives, for various machine learning purposes, such as neural networks. Development takes place on GitHub: https://github.com/bquast/libactivation Bugs can also be filed on GitHub: https://github.com/bquast/libactivation/issues Much of the inspiration came from my 2015 R package sigmoid, which was split out of my Recurrent Neural Network framework RNN:
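As a flavour of what such a package provides, here is a minimal base-R sketch of two activation functions and their derivatives (illustrative only; the names need not match the libactivation API):

{% highlight r %}
sigmoid       <- function(x) 1 / (1 + exp(-x))
sigmoid_prime <- function(x) sigmoid(x) * (1 - sigmoid(x))
relu          <- function(x) pmax(x, 0)
relu_prime    <- function(x) as.numeric(x > 0)

sigmoid(0)      # 0.5
relu(c(-2, 3))  # 0 3
{% endhighlight %}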

attention package on CRAN

The attention R package, describing how to implement from scratch the attention mechanism - which forms the basis of transformers - in the R language, is now available on CRAN. A key example of the results that were achieved using (much larger and more complex forms of) transformers is the change from AlphaFold (1) (which relied primarily on LSTM) to AlphaFold2 (which is primarily based on transformers). This change pushed the results in the protein folding competition CASP-14 to a level of accuracy that made protein structure prediction accurate enough for practical purposes. A major scientific breakthrough, the impact of which can barely be overstated.

Self-Attention from Scratch in R

EDIT 2022-06-24: this code is now available (with helper functions) in the R package attention, which is on CRAN. You can install it simply using: install.packages('attention') See also my blog post attention on CRAN. The development takes place on GitHub. This post describes how to implement the attention mechanism - which forms the basis of transformers - in the R language. The code is translated from the Python original by Stefania Cristina (University of Malta) in her post The Attention Mechanism from Scratch
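The core of the mechanism is small; here is a minimal scaled dot-product attention sketch in base R (my own condensed version, not the package's code):

{% highlight r %}
softmax <- function(x) exp(x) / sum(exp(x))

# scaled dot-product attention: softmax(Q K' / sqrt(d)) V
attention <- function(Q, K, V) {
  scores  <- Q %*% t(K) / sqrt(ncol(K))
  weights <- t(apply(scores, 1, softmax))  # row-wise softmax
  weights %*% V
}
{% endhighlight %}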

Online Office Hours

With over a year of working from home and an end not immediately in sight, I felt it was time to think a bit structurally about how to work remotely as effectively as possible. The clear missing element is the watercooler conversations / coffees at the cafeteria. Universities have, to some extent, always had to deal with faculty not having a default schedule for being at their desks, which would normally make dropping in easy. The way this is typically dealt with is by holding office hours at a set time every week (when not traveling). For the person holding office hours, when nobody comes by, this is normally a good time to get done with some paperwork that otherwise gets forgotten.

Ron Graham's Game

For a job interview at the WHO I was asked to build a numeric version of Noughts and Crosses (Tic-Tac-Toe to some), called Ron Graham’s Game (repo). Ron Graham’s Game is a numerical variant of Noughts and Crosses / Tic-Tac-Toe. In the general form, the board is a square matrix of length L >= 3, Player 1 has stones for all the odd numbers in the range 1:L^2, Player 2 has stones for all the even numbers in the range 2:(L^2-1), and a player wins by completing a row/column/diagonal whose sum equals sum(1:L^2)/L, i.e. L(L^2+1)/2 (15 on the classic 3x3 board).
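The magic-sum win condition can be sketched in a few lines of R (my own illustration, not the repo's code):

{% highlight r %}
# a row/column/diagonal wins when it is full and sums to the magic constant
magic_sum <- function(L) sum(1:L^2) / L          # equals L*(L^2+1)/2
line_wins <- function(line, L) !any(is.na(line)) && sum(line) == magic_sum(L)

magic_sum(3)              # 15
line_wins(c(2, 9, 4), 3)  # TRUE
{% endhighlight %}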

Tech Learn Talks

Today I gave a presentation at the UN Innovation Network’s TechLearnTalks (archived, backup). The slides from my presentation are available here: https://docs.google.com/presentation/d/1qDtY8jrMnDz3tGpqg-AvgIBB5iiY54Jko6lMDZP7c5o/ The live demo spreadsheet is available here: https://docs.google.com/spreadsheets/d/1j1dXgZ_9RzvBdKFyA1Goii6IzPniNWSLs14D-kN_RNo/ EDIT (2020-06-11): due to popular demand I am turning the spreadsheet into a somewhat more formal product: http://spreadsheet.network/ - a neural network in a spreadsheet, with an FAQ. Paper to follow.

homomorphic encryption in R

Homomorphic encryption allows computations to be performed on encrypted data. This has enormous potential in areas of machine learning that deal with private data, such as medical records. Below is an implementation of homomorphic encryption in R. It encrypts two pieces of data, m=10 and m1=2; once they are encrypted (as cipher and cipher2 respectively), the two encrypted forms can be added together (cyphertotal). They can then be decrypted to reveal mess2 to equal 12 (i.e. the sum of 10 and 2).
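To give a feel for the idea (and only the idea - this is NOT the scheme implemented in the post), here is a toy additively homomorphic construction in which encryption adds a secret key modulo n, so that adding ciphertexts adds the underlying plaintexts:

{% highlight r %}
n   <- 1e6      # public modulus (toy sizes, for illustration only)
key <- 12345    # secret key

enc <- function(m) (m + key) %% n
# each ciphertext carries one copy of the key, so subtract it once per ciphertext
dec_sum <- function(ciph, n_ciphertexts) (ciph - n_ciphertexts * key) %% n

cipher      <- enc(10)
cipher2     <- enc(2)
ciphertotal <- (cipher + cipher2) %% n
dec_sum(ciphertotal, 2)  # 12
{% endhighlight %}

Real homomorphic schemes (Paillier, BGV, CKKS, ...) obtain the same additive property with actual cryptographic security.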

Compiling TensorFlow on Arch Linux

Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA The above notification keeps popping up whenever you use TensorFlow, to remind you that your models could be training faster if you used binaries compiled with the right configuration. When TensorFlow first came out, it became available to Arch Linux users as a package in the Arch User Repository (AUR), meaning that it was compiled on your local system.

Promoting Content in Africa

In the keynote at the African Peering and Interconnection Forum (AfPIF) 2016 I presented the Promoting Content in Africa report, written together with Michael Kende. These are the slides, also available here: https://drive.google.com/file/d/11uHhY4yLrvBbmf4N2aZm0SE1aUyoDhWt/ The recording is available here (archived, backup). Blog posts: the report is accompanied by a blog post on local content creation and a blog post on local content availability, each of which summarises one of the chapters of the report.

Making the Next Billion Demand Access

The Local-Content Effect of google.co.za in Setswana. I presented this paper at EEA 2016 in Geneva. Abstract: This paper shows that an exogenous increase in the accessibility of local-language content leads to an increase in demand for internet connectivity among native speakers. Internet connectivity provides enormous improvements in quality of life as well as opportunities for the newly connected, yet recent attempts to connect the current 'next billion' in places such as sub-Saharan Africa have not met expectations. In places where infrastructure has come online and prices have gone down, the expected consequent increase in usage was not observed. The introduction of the Setswana language on the South African Google Search website was a spillover of the Botswana Google Search website being translated from English to Setswana. This exogenous improvement in the accessibility of Setswana-language content has resulted in a substantial increase in the number of native Setswana speakers coming online and owning personal computers. It has also led to increased usage of the Setswana language online, creating a positive feedback loop. This suggests that connecting the fourth billion will require a greater focus on the demand side of connectivity, specifically by means of local content.

sigmoid package

The sigmoid package makes it easy to become familiar with the way neural networks work by demonstrating the key concepts using straightforward code examples. Installation: the package can now be installed from CRAN using: {% highlight r %} install.packages('sigmoid') # case sensitive! {% endhighlight %} Usage: after installation, the package can be loaded using: {% highlight r %} library(sigmoid) {% endhighlight %} For information on using the package, please refer to the help files.

Handcoding a Difference in Differences

In this post we will discuss how to manually implement a Difference-in-Differences (DiD) estimator in R, using simulated data. {% highlight r %}
# reproducible random numbers
set.seed(123)
# untreated and treated independent variable for period 0
xutr <- rnorm(1000, mean=5)
xtr  <- rnorm(1000, mean=1)
# create a data.frame with the dep. var., indep. var., time and id vars for period 0
dfutr <- data.frame(time = 0, id =    1:1000, y = xutr + 15 + rnorm(1000), x = xutr)
dftr  <- data.frame(time = 0, id = 1001:2000, y = xtr  +  9 + rnorm(1000), x = xtr)
df0   <- rbind(dfutr, dftr)
{% endhighlight %}
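The DiD estimate itself is just a double difference of group means. A self-contained sketch (my own continuation of the simulation idea, with an assumed common time trend of 1 and treatment effect of 2):

{% highlight r %}
set.seed(123)
y_untreated_0 <- rnorm(1000, mean = 20)  # untreated group, period 0
y_treated_0   <- rnorm(1000, mean = 10)  # treated group,   period 0
y_untreated_1 <- rnorm(1000, mean = 21)  # untreated group, period 1 (trend of 1)
y_treated_1   <- rnorm(1000, mean = 13)  # treated group,   period 1 (trend 1 + effect 2)

# difference-in-differences: (treated change) - (untreated change)
did <- (mean(y_treated_1) - mean(y_treated_0)) -
       (mean(y_untreated_1) - mean(y_untreated_0))
did  # approximately 2
{% endhighlight %}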

Handcoding a Panel Model

The most basic panel estimation is the Pooled OLS model; this model combines all data across indices and performs a regular Ordinary Least Squares estimation. {% highlight r %}
# load the plm library for panel estimation
library(plm)
# load the Crime data set
data(Crime)
{% endhighlight %} {% highlight r %}
# define the model
m1 <- formula(crmrte ~ prbarr + prbconv + polpc)
# create a panel data.frame (pdata.frame) object
PanelCrime <- pdata.frame(Crime, index = c("county", "year"))
{% endhighlight %}
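Pooled OLS simply stacks all county-year observations and runs an ordinary regression, so (with plm installed) its coefficients coincide with those from lm() - a sketch of the equivalence, assuming the Crime data ships with your version of plm:

{% highlight r %}
library(plm)
data(Crime)

pooled <- plm(crmrte ~ prbarr + prbconv + polpc, data = Crime,
              index = c("county", "year"), model = "pooling")
ols <- lm(crmrte ~ prbarr + prbconv + polpc, data = Crime)

all.equal(unname(coef(pooled)), unname(coef(ols)))  # TRUE
{% endhighlight %}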

Hand Coding Instrumental Variables

In a previous post we discussed the linear model and how to write a function that performs a linear regression. In this post we will use that linear model function to perform a [Two-Stage Least Squares estimation]. This estimation allows us to […] Recall that we built the following linear model function. {% highlight r %}
ols <- function(y, X, intercept=TRUE) {
  if (intercept) X <- cbind(1, X)
  solve(t(X) %*% X) %*% t(X) %*% y  # solve for beta
}
{% endhighlight %}
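A minimal two-stage least squares sketch built on that ols() function (my own illustration of the technique, with a single endogenous regressor and a single instrument):

{% highlight r %}
ols <- function(y, X, intercept=TRUE) {
  if (intercept) X <- cbind(1, X)
  solve(t(X) %*% X) %*% t(X) %*% y
}

tsls <- function(y, x, z) {
  b1   <- ols(x, z)           # first stage: endogenous x on instrument z
  xhat <- cbind(1, z) %*% b1  # fitted (exogenous) part of x
  ols(y, xhat)                # second stage: y on the fitted values
}

# simulated check: u makes x endogenous, z is a valid instrument
set.seed(1)
n <- 10000
z <- rnorm(n); u <- rnorm(n)
x <- z + u + rnorm(n)
y <- 1 + 2 * x + u + rnorm(n)
tsls(y, x, z)[2]  # close to the true coefficient of 2
{% endhighlight %}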

Linear Model and Neural Network

In this short post I want to quickly demonstrate how the most basic neural network (no hidden layer) gives us the same results as the linear model. First we need data. {% highlight r %} data(swiss) str(swiss) {% endhighlight %} {% highlight text %}
'data.frame': 47 obs. of 6 variables:
 $ Fertility       : num 80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 ...
 $ Agriculture     : num 17 45.1 39.7 36.5 43.5 35.3 70.2 67.8 53.3 45.2 ...
 $ Examination     : int 15 6 5 12 17 9 16 14 12 16 ...
 $ Education       : int 12 9 5 7 15 7 7 8 7 13 ...
 $ Catholic        : num 9.96 84.84 93.4 33.77 5.16 ...
 $ Infant.Mortality: num 22.2 22.2 20.2 20.3 20.6 26.6 23.6 24.9 21 24.4 ...
{% endhighlight %}
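The point can be checked directly: a network with no hidden layer is a linear map, so its least-squares solution is the OLS solution. A sketch (my own code, not the post's):

{% highlight r %}
data(swiss)
X <- cbind(1, as.matrix(swiss[, -1]))  # intercept plus the five predictors
y <- swiss$Fertility

# normal equations: the "weights" of the linear network
beta <- solve(t(X) %*% X) %*% t(X) %*% y

all.equal(unname(drop(beta)),
          unname(coef(lm(Fertility ~ ., data = swiss))))  # TRUE
{% endhighlight %}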

Hand Coding Hinton's Dropout

Andrew Trask wrote an amazing post at I am Trask called A Neural Network in 11 lines of Python. In the post Hand Coding a Neural Network I translated the Python code into R. In a follow-up post, A Neural Network in 13 lines of Python, Andrew shows how to improve the network with optimisation through gradient descent. The third post, Hinton's Dropout in 3 Lines of Python, explains a feature called dropout. The R version of the code is posted below.

Hand Coding Gradient Descent

Andrew Trask wrote an amazing post at I am Trask called A Neural Network in 11 lines of Python. In the post Hand Coding a Neural Network I translated the Python code into R. In a follow-up post, A Neural Network in 13 lines of Python, Andrew shows how to improve the network with optimisation through gradient descent. Below I have translated the original Python code used in the post to R. The original post has an excellent explanation of what each line does. I have tried to stay as close to the original code as possible; all lines and comments correspond directly to the original code.
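The idea of gradient descent in its smallest possible form (my own toy example, not the network from the post): repeatedly step against the gradient until the minimum is reached.

{% highlight r %}
# minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w  <- 0     # starting point
lr <- 0.1   # learning rate

for (i in 1:100) w <- w - lr * 2 * (w - 3)
w  # converges to the minimiser, 3
{% endhighlight %}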

WIOD data sets package

The wiod package is now available on CRAN. The package contains the complete WIOD data sets, in a format compatible with the decompr and gvc packages. Installation: the package can be installed using: {% highlight r %} install.packages('wiod') # case sensitive! {% endhighlight %} Usage: following installation, the package can be loaded using: {% highlight r %} library(wiod) {% endhighlight %} Data can be loaded using the data() function, with wiod followed by the last two digits of the required year as the argument, e.g.

introducing diagonals

A new R package diagonals is now available on CRAN. The package implements several tools for dealing with fat diagonals on matrices, such as this one: {% highlight text %}
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]    1    1    1    1    0    0    0    0    0     0     0     0
 [2,]    1    1    1    1    0    0    0    0    0     0     0     0
 [3,]    1    1    1    1    0    0    0    0    0     0     0     0
 [4,]    1    1    1    1    0    0    0    0    0     0     0     0
 [5,]    0    0    0    0    1    1    1    1    0     0     0     0
 [6,]    0    0    0    0    1    1    1    1    0     0     0     0
 [7,]    0    0    0    0    1    1    1    1    0     0     0     0
 [8,]    0    0    0    0    1    1    1    1    0     0     0     0
 [9,]    0    0    0    0    0    0    0    0    1     1     1     1
[10,]    0    0    0    0    0    0    0    0    1     1     1     1
[11,]    0    0    0    0    0    0    0    0    1     1     1     1
[12,]    0    0    0    0    0    0    0    0    1     1     1     1
{% endhighlight %}
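In base R such a fat diagonal can be constructed with a Kronecker product (an illustration of the structure; the diagonals package provides its own helpers):

{% highlight r %}
# 3 blocks of 4x4 ones along the diagonal
fat <- kronecker(diag(3), matrix(1, 4, 4))

dim(fat)   # 12 12
fat[1, 4]  # 1 (inside the first block)
fat[1, 5]  # 0 (outside it)
{% endhighlight %}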

plot.ly

Quick experiment on embedding plot.ly graphics. {% highlight r %} library(ggplot2) library(plotly) {% endhighlight %} Basic plotly: {% highlight r %} plot_ly(iris, x = Petal.Length, y = Petal.Width, color = Species, mode = "markers") {% endhighlight %} {% highlight text %} Error in html_screenshot(x): Please install the webshot package (if not on CRAN, try devtools::install_github("wch/webshot")) {% endhighlight %} Now using ggplot2: {% highlight r %} ggiris <- qplot(Petal.Width, Sepal.Length, data = iris, color = Species) ggplotly(ggiris) {% endhighlight %}

Male/Female Bargaining Power and Child Growth

Increased male bargaining power in households causes greater expenditure on food, an improvement in Weight-for-Age Z-scores in young children, and a deterioration in Height-for-Age Z-scores in very young children, as observed in the context of South Africa's 2010 state pension expansion for males. In 2010 the male eligibility age for the South African state pension was brought to par with the female eligibility age (60, previously 65). I exploit this policy change in order to estimate the effect of increased male bargaining power in the household on the growth of young children living in the same household, as well as on food expenditure. The policy change took place shortly after the completion of the first wave of South Africa's National Income Dynamics Survey and shortly before the start of the second wave, which lends itself well to a Difference-in-Differences approach on the right-hand side. On the left-hand side I use z-scores of growth anthropometrics of young children in the household (against WHO standards) as well as food expenditure.

gvc package on CRAN

A new R package gvc is now available on CRAN. The package implements several global value chain indicators: Importing to Export (i2e(), a.k.a. vertical specialization), Exporting to Re-export (e2r()), and New Revealed Comparative Advantage (nrca()), as well as several other tools. The gvc package can now be installed directly from R using: {% highlight r %} install.packages("gvc") {% endhighlight %} In addition to this, a development version is available on GitHub; this version is to be used at your own risk, and can be installed using:

decompr on CRAN

I am proud to announce that after a few emails back and forth with Prof. Brian Ripley, which consisted mostly of me apologising for not following the proper procedure for submission, I received an email announcing that my decompr package is now available on CRAN. The package can now easily be installed using: {% highlight r linenos %} install.packages("decompr") {% endhighlight %} The published version contains several updates; most importantly, I used a regional input-output table from the WIOD project, which is substantially smaller and makes the decompositions significantly faster.

Data Science Specialisation

Yesterday the Johns Hopkins School of Public Health published a post about their Data Science Specialisation on the online MOOC platform Coursera. The post mentions the first batch of 266 students finishing the specialisation (among them, me :-) ). In total more than 800,000 people have registered for one of the courses, of which 14,000 finished at least one. The Specialisation: the Data Science Specialisation consists of nine courses and a capstone project (which was announced, but is yet to open for registration). The courses are:

Replicable Development Economics

The tagline of this blog says something about replicable development economics using R and git. So far, I have posted gimmicks on new R tools such as shiny, rmarkdown, and my own package. I have also posted on how to use Git, GitHub, and Jekyll to write a website/blog. However, I have never brought the two together and shown how this feeds into creating replicable research. In this post I will briefly describe what Git and R are, and how I use them for my work. I hope to post something tomorrow about useful resources for mastering both these tools (tomorrow's post).

A jekyll blog

What are jekyll, markdown, and git(hub)? And why would you need all of this for a blog, instead of a simple Blogspot or Wordpress page? The short answer is more control: by having fewer and more transparent layers, you retain more control over the content and layout of your blog. Since launching this blog last week, I have received a number of questions about how to set up something similar. Below I briefly describe the main steps for setting up a jekyll blog, and tomorrow I will go into the details of how I customised this one.

The decompr package

I am proud to announce the beta version of the decompr R package. The package implements export decomposition using the Wang-Wei-Zhu (Wang, Wei, and Zhu 2013) and Kung-Fu (Mehrotra, Kung, and Grosky 1990) algorithms. It comes with a sample data set from the WIOD project, and has its own mini site. Update: the decompr package is now available on CRAN, as also announced in this post. Inputs: the package uses Inter-Country Input-Output (ICIO) tables, such as the World Input Output Database (Timmer et al. 2012).

ggvis, shiny, and HTML5 slides

ggvis is a wonderful new tool for creating interactive graphics, which was built with Shiny apps in mind. In this post I will go over how you can create a Shiny app using ggvis and incorporate the 'app' in an rmarkdown slideshow (interactively). Sepal-Modeling is a shiny app (repo) which uses ggvis to fit LOESS smoothers on the sepal ratios of the iris dataset. There are separate smoothers for every species, as well as a general smoother for all observations. The span can be adjusted in order to see if we need to model the sepal ratio per species or if we can just model it jointly.

Hand Coding a Linear Model function

In yesterday's post we developed a method for constructing a multivariate linear model with an intercept. Today we will turn the collection of loose commands into an integrated and easy-to-use function. A small recap from yesterday: we start by loading data and assigning our variables to objects. {% highlight r %} data(iris) x1 <- iris$Petal.Width x2 <- iris$Sepal.Length y <- iris$Petal.Length {% endhighlight %} We now construct our linear model; the fastest way of doing this is using the QR decomposition.
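The QR route can be sketched as follows (my own condensed version; the post's function may differ in details):

{% highlight r %}
data(iris)
x1 <- iris$Petal.Width
x2 <- iris$Sepal.Length
y  <- iris$Petal.Length

X    <- cbind(1, x1, x2)   # design matrix with intercept
beta <- qr.coef(qr(X), y)  # least squares via the QR decomposition

all.equal(unname(beta), unname(coef(lm(y ~ x1 + x2))))  # TRUE
{% endhighlight %}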

Hand Coding Categorical Variables

In last week's posts we discussed hand-coding a linear model and writing a convenient function for it; in today's post we will take this a step further by including a categorical variable. Swiss life: since I live in Geneva, we will use a built-in data set that is close to home. {% highlight r %} data("swiss") {% endhighlight %} This data set compares fertility rates in 47 different French-speaking regions (sub-Cantonal) of Switzerland around the year 1888 (for more information see help("swiss")).
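One way to hand-code a categorical variable is as an explicit dummy column in the design matrix (my own illustration; the post's worked example may differ):

{% highlight r %}
data(swiss)
# dummy variable: 1 if the region is majority Catholic, 0 otherwise
catholic_majority <- as.numeric(swiss$Catholic > 50)

X    <- cbind(1, swiss$Education, catholic_majority)
beta <- solve(t(X) %*% X) %*% t(X) %*% swiss$Fertility

all.equal(unname(drop(beta)),
          unname(coef(lm(Fertility ~ Education + catholic_majority,
                         data = swiss))))  # TRUE
{% endhighlight %}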