Regression: Formulas for coefficients

Issue

Understand how the formulas for the coefficients in simple linear regression are obtained.

Model and Ordinary Least Squares:

The model below describes the relationship between the Population (explained variable) and the Size (explanatory variable) of the kingdoms in Game of Thrones.

$Population_{i} = A_{0}+A_{1}\times Size_{i}+\varepsilon_{i}$

Sum of Squared Errors (SSE): $\sum \limits_{i=1}^n \varepsilon_{i}^2$

Ordinary Least Squares (OLS): $\min_{A_{0},A_{1}} SSE$

Ordinary Least Squares (OLS): $\min_{A_{0},A_{1}} \sum \limits_{i=1}^n \varepsilon_{i}^2$

Ordinary Least Squares (OLS): $\min_{A_{0},A_{1}} \sum \limits_{i=1}^n (Population_{i}-A_{0}-A_{1}\times Size_{i})^2$
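For reference, here is a sketch of the usual route from this minimization to explicit formulas for the coefficients. The hats ($\hat{A}_{0}$, $\hat{A}_{1}$) for the estimated coefficients and the bars ($\overline{Size}$, $\overline{Population}$) for the sample means are notational assumptions that do not appear above.

Setting the partial derivatives of the SSE with respect to $A_{0}$ and $A_{1}$ to zero gives:

$\frac{\partial SSE}{\partial A_{0}} = -2 \sum \limits_{i=1}^n (Population_{i}-A_{0}-A_{1}\times Size_{i}) = 0$

$\frac{\partial SSE}{\partial A_{1}} = -2 \sum \limits_{i=1}^n Size_{i}\times(Population_{i}-A_{0}-A_{1}\times Size_{i}) = 0$

Solving these two equations (and assuming the sizes are not all equal) yields:

$\hat{A}_{1} = \frac{\sum \limits_{i=1}^n (Size_{i}-\overline{Size})(Population_{i}-\overline{Population})}{\sum \limits_{i=1}^n (Size_{i}-\overline{Size})^2}$

$\hat{A}_{0} = \overline{Population}-\hat{A}_{1}\times \overline{Size}$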

Creation of the favicon

Favicon

I created a favicon for this website. It represents the star of the Seven, the symbol of the main religion in Westeros. Each branch represents a god. We often see this symbol in the series, especially in the temple used for the religion: the sept (which, strangely, means seven in French). The importance of the religion in the story grows with the seasons… The free software Inkscape was used to make it.

Regression: region’s size and population in Game of Thrones (coefficients)

Issue

We want to understand the relationship between the population and the size of the kingdoms in Westeros. Indeed, the population of a kingdom plays a role in the size of its army.
Our model is: $Population_{i} = A_{0}+A_{1}\times Size_{i}+\varepsilon_{i}$. In this post, we will detail the construction of the coefficients.
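As a rough preview of what "constructing the coefficients" means in practice, the sketch below computes $A_{0}$ and $A_{1}$ by hand with the standard closed-form OLS formulas and checks them against R's built-in lm function. The vectors size and population hold invented toy numbers, not the Westeros data, and the object names are assumptions made for this illustration only.

```r
# Hypothetical toy data: these numbers are invented for illustration only
# and are NOT the Westeros figures used in this post.
size       <- c(1, 2, 3, 4, 5)
population <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Closed-form OLS estimates for Population = A0 + A1 * Size + error
A1 <- sum((size - mean(size)) * (population - mean(population))) /
  sum((size - mean(size))^2)
A0 <- mean(population) - A1 * mean(size)

# Cross-check against R's built-in linear model fit
fit <- lm(population ~ size)
same <- isTRUE(all.equal(unname(coef(fit)), c(A0, A1)))
print(c(A0 = A0, A1 = A1))
print(same)
```

The slope is the sum of cross-deviations divided by the sum of squared deviations of the explanatory variable, and the intercept makes the fitted line pass through the point of means.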

1) Get the data

We do not know of any reference or “official” census (from GRRM) for the population in Game of Thrones. Consequently, we went through several forums to see what people say about the question of population. In the end, we chose data from Reddit, from a post by scolbert08 published in 2014. I also collected data on the size of the kingdoms. I entered the data manually into an Excel spreadsheet and then imported it into R (the read.csv2 function reads French-format CSV files, with semicolon separators and decimal commas; for English-format files, use the read.csv function). The file contains more data from other websites, but we only want the variables “Climate”, “Size2” and “Population7”. Only the first nine rows are kept because we do not have data for the other regions; that is why there is no data for the Essos region, the Wall and beyond the Wall. The row names are set to the names of the kingdoms. We also rename the columns to drop the 7 and the 2 from the names.

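# Import the French-format CSV file (semicolon separator, decimal comma)
df <- read.csv2("H:/Travail/GoT/regression2.csv")
# Keep columns 9, 13 and 14 and drop the rows with missing values
df2 <- na.omit(df[, c(9, 13, 14)])
# Use the kingdom names (first column, first nine rows) as row names
rownames(df2) <- df[1:9, 1]
# Drop the 7 and the 2 from the column names
colnames(df2) <- c("Climate", "Population", "Size")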
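Since the original spreadsheet is not available to the reader, the import steps above can be tried on a small inline file. This is only a sketch: the content of csv_text, the region names and the column layout are invented, and the columns are selected by name rather than by position, which is a substitution made here for readability.

```r
# Hypothetical semicolon-separated content with decimal commas
# (the "French" CSV layout); all names and values are invented.
csv_text <- "Region;Climate;Population7;Size2
RegionA;Cold;1,5;10,2
RegionB;Warm;3,0;4,8
RegionC;;;"

# read.csv2 defaults to sep = ";" and dec = ","
toy <- read.csv2(text = csv_text, na.strings = "")

# Keep the three variables of interest and drop incomplete rows
toy2 <- na.omit(toy[, c("Climate", "Population7", "Size2")])

# na.omit keeps the original row indices as row names,
# so they can be used to look up the matching region names
rownames(toy2) <- toy$Region[as.integer(rownames(toy2))]
colnames(toy2) <- c("Climate", "Population", "Size")
print(toy2)
```

Looking up the region names through the surviving row indices avoids relying on the complete rows being the first ones in the file.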