Introduction to my Data Analysis Final Project¶

In this Research Project I would like to explore data related to the stock market and what factors have the greatest influence on wether a company's stock price goes up or down. In recent years, it has been fervently debated regarding wether stocks are based on their financials (especially with the rise of growth stocks with high ceilings such as the new tech companies). According to economics, the concept of price equilibrium and long run self-adjustment means that stock prices should reflect the value of the company itself. Does this statement still remain true?

In this project, I will be using multiple types of Machine Learning techniques such as blah to find out which factors are the most important and wether there even is a correlation between financial success and stock price.

Import Libraries

from mpl_toolkits.mplot3d import Axes3D
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt # plotting
import numpy as np # linear algebra
import os # accessing directory structure
import pandas as pd # data processing

Exploring Dataset

df1 = pd.set_option('display.max_columns', None)
df1 = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
df1.dataframeName = '2014_Financial_Data.csv'
nRow, nCol = df1.shape
print(f'There are {nRow} rows and {nCol} columns')
print(df1.dtypes)
df1.info()
df1.head(5)

There are 3808 rows and 225 columns
Unnamed: 0               object
Revenue                 float64
Revenue Growth          float64
Cost of Revenue         float64
Gross Profit            float64
                         ...   
R&D Expense Growth      float64
SG&A Expenses Growth    float64
Sector                   object
2015 PRICE VAR [%]      float64
Class                     int64
Length: 225, dtype: object
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3808 entries, 0 to 3807
Columns: 225 entries, Unnamed: 0 to Class
dtypes: float64(222), int64(1), object(2)
memory usage: 6.5+ MB

As can be seen from the data shown above the dataset has 3808 rows (3808 companies tracked) and we have 225 columns of factors with the last two columns being the price YOY change and the Class (which represents wether the stock went up or down with 1 meaning up and 0 meaning down). Both these factors are predictive factors and we can use them to find out which factors have the greatest correlation with PRICE VAR or CLASS. Almost all of the columns show float datatypes except the name (object), the sector (object) and the Class (int). Therefore to preprocess this data we must first make some adjustments to columns: Unnamed and Sector.

#remove data with NAN 
#change unnamed
#change sector
df1.rename(columns={'Unnamed: 0':'Stock Name'}, inplace=True)
data_info = pd.set_option('display.max_columns', None)
data_info=pd.DataFrame(df1.dtypes).T.rename(index={0:'column type'})
data_info=data_info.append(pd.DataFrame(df1.isnull().sum()).T.rename(index={0:'null values (nb)'}))
data_info=data_info.append(pd.DataFrame(df1.isnull().sum()/df1.shape[0]*100).T.
                         rename(index={0:'null values (%)'}))
display(data_info)

#explore sector, pricevar and class
df1.describe()

df1.describe(include=np.object)

Linear Regression: Our first objective will be to find out if there is a linear correlation between any of the factors and PRICEVAR or the factors and Class. I had a hypothesis that out of all the variables, revenue growth would have the greatest effect on the final yoy change in stock price. However, as can be seen below, there seemed to not be much of a correlation.

dfSector = pd.get_dummies(df1["Sector"])
dfClean = df1
dfClean.drop('Sector',
  axis='columns', inplace=True)
dfClean = df1.dropna(subset=['Class','2015 PRICE VAR [%]'])
dfClean = pd.concat([dfClean,dfSector], axis=1)
dfVAR = dfClean['2015 PRICE VAR [%]']
dfClass = dfClean['Class']
dfClean.drop(['2015 PRICE VAR [%]', 'Class'],
  axis='columns', inplace=True)

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
model = LinearRegression()
x = dfClean['Revenue Growth']
x= x.fillna(x.mean())
X = pd.concat([x,dfVAR], axis=1)
y = dfVAR.to_numpy()
plt.plot(X, y)
model.fit(X, y)
y_predicted = model.predict(X)
plt.plot(X, y_predicted)
plt.show()
# print(X)
from yellowbrick.regressor import ResidualsPlot
from sklearn.metrics import mean_squared_error
# Instantiate the linear model and the visualizer
#ridge = Ridge()
visualizer = ResidualsPlot(model)
visualizer.fit(X, y_predicted)
print("R^2: {}".format(model.score(X, y_predicted)))
rmse = np.sqrt(mean_squared_error(x,y))
print("Root Mean Squared Error: {}".format(rmse))
visualizer.poof();
plt.show()
print(y_predicted)

R^2: 1.0
Root Mean Squared Error: 40996.17652829138

[ -9.323276   -25.51219289  33.11829672 ... -23.55890017   7.77957917
 -34.09961311]

As can be seen from the above graph and the lasso model below, linear regression is too simple to examine a dataset with over 200 variables. Therefore we must pursue some more difficult model to evaluate the dataset. As can be seen below, after the lasso came out, for the 200 something variables all of them had a supremely low coefficient meaning that they had a very low effect on the final outcome. This is expected as we are certainly oversimplifying here.

X2 = dfClean.fillna(dfClean.mean())
X2 = X2.drop('Stock Name',axis=1)
y2 = dfVAR.values
X_train, X_test, y_train, y_test = train_test_split(X2, y2, test_size=0.33, random_state=42)
#Let's build a Lasso model
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge


alpha_user = 0.95420530945486497 #play with alpha and you will get different results (between 0 and 1)
lasso_model = Lasso(alpha=alpha_user,normalize=True)
#ridge_mode = Ridget(alpha= alpha_user, normalize=True)
lasso_model.fit(X_train,y_train)
lasso_coef = lasso_model.coef_
# print(lasso_coef)
e_dataframe=pd.DataFrame({'a':lasso_coef}) 
e_dataframe = e_dataframe.sort_values(by=['a'],ascending=False)
print(e_dataframe.head(5))
dx = dfClean.columns.values.tolist()
print(dx[230])
print(dx[226])
print(dx[164])

              a
230  232.710423
226  151.960117
164   71.314640
21    69.307591
116    2.472822
Real Estate
Energy
R&D to Revenue

Interestingly enough the top 3 variables were Real Estate, Energy and R&D to Revenue. Traditionally, both Real Estate and Energy are regarded as traditional stocks and therefore their profits have more to do with their stock price unlike the new tech stocks. On the other hand R&D to Revenue's high correlation makes sense as well as this is a good variable to see if a company will gain in profit in the long term.

Logistic Regression

Since, the data from the linear regression showed some positive data regarding the correlation between Sector and change in stock price. I decided to do a logisitic regression study on just sector and class (we can do this because sector is classification data). After doing this, using a confusion matrix, it is visible that although there is a 0.6 accuracy for the model, it is not significant enough to consider a relation betweeen the data.

from sklearn.linear_model import LogisticRegression
X3 = dfSector
y3 = dfClass.values
X_train2, X_test2, y_train2, y_test2 = train_test_split(X3, y3, test_size=0.33, random_state=42)
model = LogisticRegression()

model.fit(X_train2, y_train2)
y_pred2 = model.predict(X_test2)

from sklearn import metrics
print(metrics.accuracy_score(y_test2, y_pred2))


from sklearn.metrics import plot_confusion_matrix 
cnf_matrix = metrics.confusion_matrix(y_test2, y_pred2)
print(cnf_matrix)
metrics.plot_confusion_matrix(model, X_test2, y_test2)
plt.show()
print(metrics.classification_report(y_test2, y_pred2))

0.5974542561654733
[[601 104]
 [402 150]]

              precision    recall  f1-score   support

           0       0.60      0.85      0.70       705
           1       0.59      0.27      0.37       552

    accuracy                           0.60      1257
   macro avg       0.59      0.56      0.54      1257
weighted avg       0.60      0.60      0.56      1257

Let's take a look at what the Sector data really is and how the companies are spread out among different sectors.

import seaborn as sns
df = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
count = df['Sector'].value_counts()
plt.figure(figsize=(15,10))
ax = sns.countplot(x='Sector', data=df, palette="Set2", order=count.index[0:10])
ax.set(xlabel='Sectors', ylabel='Number of Companies')
plt.title("Bar Graph of Sectors")

Text(0.5, 1.0, 'Bar Graph of Sectors')

As can be seen from the bar chart above, most of the companies are related to the financial services industry and we have 11 industries in the data.

The table below shows the yoy stock price change mean, median and variance for all the sectors

# print(result)
import statistics
df = pd.DataFrame(columns=["Sector","Mean","Median","Variance"])
rows = []
df = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
list = df['Sector'].unique()
for i in list:
    dfx = df[df['Sector']==i]
    x = dfx['2015 PRICE VAR [%]'].mean()
    y = dfx['2015 PRICE VAR [%]'].median()
    z = statistics.variance(dfx['2015 PRICE VAR [%]'])
    rows.append([i, x, y, z])    
dfx = pd.DataFrame(rows, columns=["Sector","Mean","Median","Variance"])
print(dfx)
# plt.figure(figsize=(15,25))
# ax = sns.barplot(x='Sector', y='Mean', data=dfx, palette="Set2")
# ax.set(xlabel='Sectors', ylabel='Number of Companies')
# plt.title("Bar Graph of Sectors")

                    Sector          Mean     Median      Variance
0       Consumer Defensive     17.956651   2.761595  3.351567e+04
1          Basic Materials    153.635483 -22.704545  6.715706e+06
2               Healthcare    354.012343  -4.069741  5.234683e+07
3        Consumer Cyclical     47.633625  -6.094964  1.279881e+06
4              Industrials    -12.509181 -13.773614  1.307211e+03
5              Real Estate  10807.848832  -2.276476  2.611423e+10
6   Communication Services     -5.434395  -5.316752  9.448733e+02
7                   Energy    -27.376600 -33.670540  5.846270e+03
8       Financial Services    280.354020   5.464307  3.263070e+07
9                Utilities     -5.473441  -2.485156  9.778282e+02
10              Technology   1480.477975  -4.823197  8.577811e+08

The table below shows the mean, median and variance for class data for the different sectors

# print(result)
import statistics
df = pd.DataFrame(columns=["Sector","Mean","Median","Variance"])
rows = []
df = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
list = df['Sector'].unique()
for i in list:
    dfx = df[df['Sector']==i]
    x = dfx['Class'].mean()
    y = dfx['Class'].median()
    z = statistics.variance(dfx['Class'])
    rows.append([i, x, y, z])    
dfx = pd.DataFrame(rows, columns=["Sector","Mean","Median","Variance"])
print(dfx)
# plt.figure(figsize=(15,25))
# ax = sns.barplot(x='Sector', y='Mean', data=dfx, palette="Set2")
# ax.set(xlabel='Sectors', ylabel='Number of Companies')
# plt.title("Bar Graph of Sectors")

                    Sector      Mean  Median  Variance
0       Consumer Defensive  0.539394     1.0  0.249963
1          Basic Materials  0.219008     0.0  0.171753
2               Healthcare  0.448454     0.0  0.247769
3        Consumer Cyclical  0.404814     0.0  0.241468
4              Industrials  0.329341     0.0  0.221317
5              Real Estate  0.455357     0.0  0.249119
6   Communication Services  0.433735     0.0  0.248604
7                   Energy  0.131222     0.0  0.114521
8       Financial Services  0.624242     1.0  0.234920
9                Utilities  0.432990     0.0  0.248067
10              Technology  0.451389     0.0  0.248068

The graph below shows the relationship between the mean of the stock price change of different industries and the mean of different growth indicators for those industries.

# print(result)
import statistics
from sklearn import datasets, linear_model
df = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
# df = pd.DataFrame(columns=["Sector","VAR Mean","Gross Profit Mean","Net Income Mean","Net Profit Margin Mean"])
df = df.dropna(subset=['Sector','2015 PRICE VAR [%]','priceEarningsToGrowthRatio','Gross Profit Growth','Net Income Growth','Revenue Growth'])
rows = []
list = df['Sector'].unique()
for i in list:
    dfx = df[df['Sector']==i]
    a = dfx['2015 PRICE VAR [%]'].mean()
    b = dfx['priceEarningsToGrowthRatio'].median()
    c = dfx['Gross Profit Growth'].mean()
    d= dfx['Net Income Growth'].mean()
    e= dfx['Revenue Growth'].mean()
    rows.append([i, a, b, c, d, e ])    
dfx = pd.DataFrame(rows, columns=["Sector","VAR Mean","priceEarningsToGrowthRatio","Gross Profit Growth","Net Income Growth","Revenue Growth",])

fig, axs = plt.subplots(2, 2)
x = pd.concat([dfx['VAR Mean']], axis=1)
x2 = np.array(dfx['VAR Mean'])
y = np.array(dfx['Net Income Growth'])
axs[0, 0].set_title('Net Income Growth')
regr = linear_model.LinearRegression()
regr.fit(x, y)
y_pred = regr.predict(x)
# print(regr.intercept_)
# print(regr.coef_)
abline_values = [slope * i + intercept for i in x2]
axs[0, 0].plot(x2, abline_values, 'b')
plt.xlim([-50, 300])
axs[0,0].scatter(x=x2, y =y)

axs[1, 0].set_title('priceEarningsToGrowthRatio')
x = pd.concat([dfx['VAR Mean']], axis=1)
x2 = np.array(dfx['VAR Mean'])
y = np.array(dfx['priceEarningsToGrowthRatio'])
regr = linear_model.LinearRegression()
regr.fit(x, y)
y_pred = regr.predict(x)
# print(regr.intercept_)
# print(regr.coef_)
abline_values = [regr.coef_ * i + regr.intercept_ for i in x2]
axs[1, 0].plot(x2, abline_values, 'b')
axs[1,0].scatter(x=dfx['VAR Mean'], y =dfx['priceEarningsToGrowthRatio'])


axs[1, 1].set_title('Gross Profit Growth')
x = pd.concat([dfx['VAR Mean']], axis=1)
x2 = np.array(dfx['VAR Mean'])
y = np.array(dfx['Gross Profit Growth'])
regr = linear_model.LinearRegression()
regr.fit(x, y)
y_pred = regr.predict(x)
# print(regr.intercept_)
# print(regr.coef_)
abline_values = [regr.coef_ * i + regr.intercept_ for i in x2]
axs[1, 1].plot(x2, abline_values, 'b')
axs[1,1].scatter(x=dfx['VAR Mean'], y =dfx['Gross Profit Growth'])

axs[0, 1].set_title('Revenue Growth')
x = pd.concat([dfx['VAR Mean']], axis=1)
x2 = np.array(dfx['VAR Mean'])
y = np.array(dfx['Revenue Growth'])
regr = linear_model.LinearRegression()
regr.fit(x, y)
y_pred = regr.predict(x)
# print(regr.intercept_)
# print(regr.coef_)
abline_values = [regr.coef_ * i + regr.intercept_ for i in x2]
axs[0, 1].plot(x2, abline_values, 'b')
axs[0,1].scatter(x=dfx['VAR Mean'], y =dfx['Revenue Growth'])
fig. tight_layout(pad=3.0)

Above are some graphs regarding different growth indicators for the different industries and how they differ based on their YOY stock price change. As can be seen above, there doesn't seem to be much of a relation between the points however it does seem that the growth indicators all show similar spread in regards to the points meaning that the growth indicators themselves might have a relation to each other but not in regards to the price YOY change. However, it is interesting to note that each industry seems to have consistent data in regards to its sector, meaning that companies in each indivudal sectors are affected by the sector it is in.

Baseline Dummy Classifier

With this baseline dummy classifier, we know that every model we develop should have an accuracy greater than 0.5879 as that means its worse than a dummy model

from sklearn.dummy import DummyClassifier
dfClass2 = dfClass.replace(1, "1")
dfClass2 = dfClass2.replace(0, "0")
dfClass2 = pd.DataFrame(dfClass2)
X2 = df1.dropna(subset=['Class','2015 PRICE VAR [%]','Stock Name'])
xTrain, xTest, yTrain, yTest = train_test_split(X2, dfClass2, test_size = 0.3, random_state = 2020)
dummy_classifier = DummyClassifier(strategy='most_frequent')
# print(X_train.head(5))
# print(df)
dummy_classifier.fit(xTrain,yTrain)
baseline_acc = dummy_classifier.score(xTest,yTest)


### For verifying answer:
print("Baseline Accuracy (using Class) = ", baseline_acc)

Baseline Accuracy (using Class) =  0.5879265091863517

Decision Tree

X3 = dfClean.fillna(dfClean.mean())
X3.drop('Stock Name',
  axis='columns', inplace=True)
xTrain, xTest, yTrain, yTest = train_test_split(X3, dfClass2, test_size = 0.3, random_state = 2020)
df = pd.DataFrame(columns=["A", "B", "C"])
rows = []
# print(xTrain.head(5))
# print(yTrain.head(5))
for i in range(10):
    if i!=0:
        dt = DecisionTreeClassifier(max_depth=i, criterion="entropy", random_state=2020)
        dt.fit(xTrain,yTrain)
        ypred = dt.predict((xTest))
        ytrainpred = dt.predict((xTrain))
        a1 = metrics.accuracy_score(yTrain, ytrainpred)
        a2 = metrics.accuracy_score(yTest, ypred)
        rows.append([i, a1,a2])
# print(rows)

df = pd.DataFrame(rows, columns=["A", "B","C"])
print(df)

   A         B         C
0  1  0.615009  0.628171
1  2  0.619512  0.608924
2  3  0.643527  0.629921
3  4  0.653659  0.627297
4  5  0.663039  0.628171
5  6  0.689306  0.622922
6  7  0.713321  0.609799
7  8  0.751220  0.606299
8  9  0.774859  0.606299

As we can see from above, at the 5th iteration (depth of 5), the decision tree starts to overfitting the data and create test data with significantly lower accuracy when compared to the training data. Hence we should only analyse the decision tree with a depth of 5 layers.

In the confusion matrix below we can see that the model is good at predicting a company's class (stock price change 0 or 1). However it is not that great at labeling companies with stock price that went up with a significant number of stocks that were classified as false negative.

from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier(max_depth=3, criterion="entropy", random_state=2020)
dt.fit(xTrain,yTrain)
ypred = dt.predict((xTest))
ytrainpred = dt.predict((xTrain))
metrics.plot_confusion_matrix(dt, xTest, yTest)
plt.show()
cf = metrics.confusion_matrix(yTest, ypred)
print(cf)

[[501 171]
 [252 219]]

Accuracy of the train and test data for this model:

print('Accuracy =', metrics.accuracy_score(yTrain, ytrainpred))
print('Accuracy =', metrics.accuracy_score(yTest, ypred))

Accuracy = 0.6435272045028143
Accuracy = 0.6299212598425197

The decision tree below shows that deposit liabilities is the most important factor followed by capitalExpenditureCoverageRatios and Invested Cpaital. This observation is also illustrated in the feature importance table shown below the decision tree.

from sklearn import tree
fn = X3.columns
dfClass3 = dfClass.replace(1, "1")
dfClass3 = dfClass3.replace(0, "0")
cn = dfClass3.unique()
plt.figure(2)
Tree = tree.plot_tree(dt,feature_names=fn,class_names=cn,fontsize=10)
plt.figure(figsize=(100, 500)) 
plt.savefig('stockmarket.png')
plt.show()

<Figure size 7200x36000 with 0 Axes>

print('Feature Importance:', dt.feature_importances_)
imp=pd.DataFrame(zip(xTrain.columns, dt.feature_importances_))
imp.columns =['X','Y']
imp = imp.sort_values(by=['Y'], ascending=False)
pd.set_option('display.max_rows', None)
print(imp.head(5))

Feature Importance: [0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.07005993 0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.32760565 0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.07064822 0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.15979544 0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.07854426
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.20283653 0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.09050997 0.         0.
 0.         0.         0.         0.        ]
                                    X         Y
50                Deposit Liabilities  0.327606
176                  Invested Capital  0.202837
128  capitalExpenditureCoverageRatios  0.159795
225                            Energy  0.090510
149      Enterprise Value over EBITDA  0.078544

Interestingly enough, the data we have got is not bad in comparison to the logistic and linear regression data. However because we have so many different variables it is quite difficult to use a decision tree to give us supremely accurate data. Hence we must look at different methods to increase this accuracy.

Ensemble

Ensemble Learning is building a prediction model by combining the strengths of a collection of simpler base models. My ensemble algorithm here will aggregate the predictions made by each individual base model and produce a single output prediction. Hopefully by using ensemble learning we will get better models to find out what factors have the greatest influence on stock price yoy change.

Ensemble Method 1: Bagging
What is Bagging? Bagging (Bootstrap Aggregation) is an averaging method doing different tests on different training samples. Doing this will take advantage of shortcoming of trees (which are heavily influenced by training data) we can ensure we don't overfit like we did in the previous decision tree model we used above.

Method 1a: Generic Bagging:

from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
dataframe = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
dataframe.rename(columns={'Unnamed: 0':'Stock Name'}, inplace=True)
dataframe2 = dataframe.drop(columns=['Class','2015 PRICE VAR [%]','Stock Name','Sector'])
dataframe = dataframe.drop(columns=['Class','2015 PRICE VAR [%]','Stock Name','Sector'])
ensemble= dataframe.fillna(dataframe.mean())
X_train, X_test, y_train, y_test = train_test_split(ensemble, dfClass,
                                                    test_size=0.3,
                                                    random_state=42
                                                    )
model_bagging = BaggingClassifier(random_state = 42)
model_bagging.fit(X_train, y_train)
pred_bagging = model_bagging.predict(X_test)
scores = []
for model_idx, model in enumerate(model_bagging.estimators_):
    if model_idx == 0:
        print('='*40)
    preds = model.predict(X_test)
    scores.append(recall_score(y_test, preds))
    model_recall = np.round(recall_score(y_test, preds), 5)
    print(f'Recall for Base Model {model_idx+1}:\t', model_recall)
    if model_idx < (len(model_bagging.estimators_) - 1):
        print('-'*40)
    else:
        print('='*40)
ensemble_preds = model_bagging.predict(X_test)
print("Mean Recall Score:\t\t", np.round(np.array(scores).mean(), 5))
randomforest = np.round(np.array(scores).mean(), 5)
print("Std Deviation:\t\t\t", np.round(np.array(scores).std(), 5))
print("Range:\t\t\t\t", np.round(np.array(scores).ptp(), 5))
print(f'Overall Recall for model:\t {np.round(recall_score(y_test, ensemble_preds), 5)}')
# acc_bagging = accuracy_score(y_test, pred_bagging)

# print(' Accuracy = ', acc_bagging)

========================================
Recall for Base Model 1:	 0.52227
----------------------------------------
Recall for Base Model 2:	 0.52632
----------------------------------------
Recall for Base Model 3:	 0.53036
----------------------------------------
Recall for Base Model 4:	 0.51822
----------------------------------------
Recall for Base Model 5:	 0.47773
----------------------------------------
Recall for Base Model 6:	 0.52227
----------------------------------------
Recall for Base Model 7:	 0.51822
----------------------------------------
Recall for Base Model 8:	 0.53239
----------------------------------------
Recall for Base Model 9:	 0.50405
----------------------------------------
Recall for Base Model 10:	 0.48178
========================================
Mean Recall Score:		 0.51336
Std Deviation:			 0.01838
Range:				 0.05466
Overall Recall for model:	 0.41498

Below is a classification report that will give us some data regarding how accurate we are classifying the data in regards to true negatives, true positives, false negatives, and false positives.

from sklearn.metrics import (
    classification_report,
    recall_score,
    precision_score,
    accuracy_score
)
print('Classification Report:\n')
print(classification_report(y_test, pred_bagging))

Classification Report:

              precision    recall  f1-score   support

           0       0.64      0.79      0.71       649
           1       0.60      0.41      0.49       494

    accuracy                           0.63      1143
   macro avg       0.62      0.60      0.60      1143
weighted avg       0.62      0.63      0.61      1143

Method 1b: Random Forest Bagging:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.metrics import precision_score
model_rf = RandomForestClassifier(n_estimators=10, max_features=200, random_state=42)
model_rf.fit(X_train, y_train)
predict_rf = model_rf.predict(X_test)
# recall_rf = recall_score(y_test, predict_rf, average=None)
# precision_rf = precision_score(y_test, predict_rf)
scores = []
for model_idx, model in enumerate(model_rf.estimators_):
    if model_idx == 0:
        print('='*40)
    preds = model.predict(X_test)
    scores.append(recall_score(y_test, preds))
    model_recall = np.round(recall_score(y_test, preds), 5)
    print(f'Recall for Base Model {model_idx+1}:\t', model_recall)
    if model_idx < (len(model_rf.estimators_) - 1):
        print('-'*40)
    else:
        print('='*40)
ensemble_preds = model_rf.predict(X_test)
print("Mean Recall Score:\t\t", np.round(np.array(scores).mean(), 5))
randomforest = np.round(np.array(scores).mean(), 5)
print("Std Deviation:\t\t\t", np.round(np.array(scores).std(), 5))
print("Range:\t\t\t\t", np.round(np.array(scores).ptp(), 5))
print(f'Overall Recall for model:\t {np.round(recall_score(y_test, ensemble_preds), 5)}')

========================================
Recall for Base Model 1:	 0.52024
----------------------------------------
Recall for Base Model 2:	 0.44737
----------------------------------------
Recall for Base Model 3:	 0.53846
----------------------------------------
Recall for Base Model 4:	 0.48381
----------------------------------------
Recall for Base Model 5:	 0.49798
----------------------------------------
Recall for Base Model 6:	 0.50405
----------------------------------------
Recall for Base Model 7:	 0.4919
----------------------------------------
Recall for Base Model 8:	 0.49595
----------------------------------------
Recall for Base Model 9:	 0.46964
----------------------------------------
Recall for Base Model 10:	 0.50202
========================================
Mean Recall Score:		 0.49514
Std Deviation:			 0.0239
Range:				 0.09109
Overall Recall for model:	 0.40081

As can be seen from above, both the generic and decision tree based bagging models were both quite inaccurate. This make sense as we already know that the training data is not the problem that is giving us a low accuracy model. It is that we have too many variables that do not have a significant effect on the final output at the end. Let's now look at other type of ensemble methods.

To clean up the data, I have used an out of bag evaluation to see what error are we getting from using bagging and using certain data multiple times while leaving other data unused. As can be seen below the OOB technique cleans up around 2% of the data. However this is relatively unimportant as the bagging model is not accurate enough to be continued to be considered in the rest of this study.

model_rf_oob = RandomForestClassifier(n_estimators=100, max_features=7, oob_score=True, random_state=42).fit(X_train, y_train)
oob_score = round(model_rf_oob.oob_score_,4)
acc_oob = round(accuracy_score(y_test, model_rf_oob.predict(X_test)),4)
diff_oob = round(abs(oob_score - acc_oob),4)

print('OOB Score:\t\t\t', oob_score)
print('Testing Accuracy:\t\t', acc_oob)
print('Acc. Difference:\t\t', diff_oob)

OOB Score:			 0.6379
Testing Accuracy:		 0.6553
Acc. Difference:		 0.0174

Ensemble Method 2: AdaBoost
AdaBoost is a boosting algorithm that begins by assigning equal weights to all observations but based on the first set of predictions it starts to adjust the weights to increase for misclassified observations and reduced for correctly classified observations. This update keeps on going until we have a relatively accurate model.

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
base_est = DecisionTreeClassifier (max_depth =2)
ada_boost1 = AdaBoostClassifier(base_est, n_estimators=500, random_state=42, learning_rate=.15)
ada_boost1.fit(X_train, y_train)
ada_boost2 = AdaBoostClassifier(base_est, n_estimators=20, random_state=42, learning_rate=.5)
ada_boost2.fit(X_train, y_train)
res1 = round(recall_score(y_test, ada_boost1.predict(X_test)),4)
res2 = round(recall_score(y_test, ada_boost2.predict(X_test)),4)

print('Winning Model:\t {}, MODEL 1')
print('MODEL 1 Recall:\t {}'.format(res1))
print('MODEL 2 Recall:\t {}:'.format(res2))

Winning Model:	 {}, MODEL 1
MODEL 1 Recall:	 0.4352
MODEL 2 Recall:	 0.4352:

Ensemble Method 3: Gradient Boosted Trees (GBT)
Gradient Booosted Trees is another boosting algorithm but unlike Adaboost it re-weighs observations based on the prediction performance through residuals errors of the preceding models.

from sklearn.ensemble import GradientBoostingClassifier
dataframe2 = dataframe.dropna().
xTrain, xTest, yTrain, yTest = train_test_split(X3, dfClass2, test_size = 0.3, random_state = 2020)
gbc = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
gbc_pred = gbc.predict(X_test)
recall_gbc = round(recall_score(y_test, gbc_pred),4)
precision_gbc = round(precision_score(y_test, gbc_pred),4)

print('Recall :\t {}'.format(recall_gbc))
print('Precision :\t {}:'.format(precision_gbc))

Recall :	 0.4291
Precision :	 0.6405:
Recall :	 0.4291
Precision :	 0.6405:

As shown above, both the boosting algorithms have relatively high precision but in terms of recall it doesn't do quite well. Below I will be using a hard voting method to combine the logistic regression and the randomforest methods to create a combined algorithm.

Ensemble Method 4: Voting Classifier
In hard voting, we use majority voting to combine results from all the models and choose the outputs (0,1) for all values. In this model we use an ensemble of RandomForestClassifier, Support Vector Machine and Logistic Regression. At the end we get a combined accuracy score of 0.65 which is quite good compared to the other models we have used before.

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rfClf = RandomForestClassifier(n_estimators=500, random_state=0) # 500 trees. 
svmClf = SVC(probability=True, random_state=0) # probability calculation
logClf = LogisticRegression(random_state=0)

# constructing the ensemble classifier by mentioning the individual classifiers.
clf2 = VotingClassifier(estimators = [('rf',rfClf), ('svm',svmClf), ('log', logClf)], voting='hard') 

# train the ensemble classifier
clf2.fit(X_train, y_train)

clf2_pred = clf2.predict(X_test)
recall_voting = recall_score(y_test, clf2_pred)
precision_voting = precision_score(y_test, clf2_pred)
print('Accuracy score', accuracy_score(y_test, clf2_pred))

# You can use the individual classifiers to get the accuracy in the beginning and see if our ensemble performs
# better when compared to individual classifiers.

/opt/anaconda3/lib/python3.8/site-packages/sklearn/linear_model/_logistic.py:762: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

Accuracy score 0.6535433070866141

Ensemble Method 5: XGBoost
XGBoost is a boosting method that uses the gradient boosting (GBM) framework and is well known to provide better solutions than other machine learning algorithms.

import xgboost as xgb
from sklearn.metrics import mean_squared_error
import pandas as pd
import numpy as np

dataframe = pd.read_csv('2014_Financial_Data.csv', delimiter=',')
dataframe.rename(columns={'Unnamed: 0':'Stock Name'}, inplace=True)
dataframe = dataframe.drop(columns=['Class','2015 PRICE VAR [%]','Stock Name','Sector'])
ensemble= dataframe.fillna(dataframe.mean())
X = ensemble
y = dfClass
data_dmatrix = xgb.DMatrix(data=X,label=y)
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=123)
xg_reg = xgb.XGBRegressor(objective ='reg:squarederror', colsample_bytree = 0.3, learning_rate = 0.1,
                max_depth = 5, alpha = 10, n_estimators = 100)
xg_reg.fit(X_train,y_train)

preds = xg_reg.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, preds))
print("RMSE: %f" % (rmse))

RMSE: 0.469583

/opt/anaconda3/lib/python3.8/site-packages/xgboost/data.py:112: UserWarning: Use subset (sliced data) of np.ndarray is not recommended because it will generate extra copies and increase memory consumption
  warnings.warn(

from sklearn.linear_model import LinearRegression

linear_reg_model = LinearRegression()
linear_reg_model.fit(X_train,y_train)

pred_train = linear_reg_model.predict(X_train)
train_rmse = np.sqrt(mean_squared_error(y_train,pred_train))
print("Train RMSE: %f" % (train_rmse))

pred_test = linear_reg_model.predict(X_test)
test_rmse = np.sqrt(mean_squared_error(y_test,pred_test))
print("Test RMSE: %f" % (test_rmse))

Train RMSE: 0.467616
Test RMSE: 8.410056

params = {"objective":"reg:squarederror",'colsample_bytree': 0.3,'learning_rate': 0.1,
                'max_depth': 5, 'alpha': 10}

cv_results = xgb.cv(dtrain=data_dmatrix, params=params, nfold=3,
                    num_boost_round=50,early_stopping_rounds=10,metrics="rmse", as_pandas=True, seed=123)

cv_results.head()

print((cv_results["test-rmse-mean"]).tail(1))

49    0.468364
Name: test-rmse-mean, dtype: float64

xg_reg = xgb.train(params=params, dtrain=data_dmatrix, num_boost_round=10)

import matplotlib.pyplot as plt

xgb.plot_tree(xg_reg,num_trees=1)
plt.rcParams['figure.figsize'] = [500, 500]
plt.show()

As seen from the calculations above, XGBoost allows us to create a tree based on the most important factors. From the test rmse mean, it is clear that the amount of error is quite low especially in comparison to the RMSE of a linear regression. Now that we have a decent model, what does this model tell us about the data? Below I will be looking through features that are important in the model and how they play a role in affecting YOY stock price change.

from xgboost import XGBRegressor
xgb = XGBRegressor(n_estimators=100)
xgb.fit(X_train, y_train)
# plt.barh(dataframe.columns, xgb.feature_importances_)
feature_names = dataframe.columns
sorted_idx = xgb.feature_importances_.argsort()
# print(sorted_idx)
# print(feature_names)
# print(xgb.feature_importances_)  

rows=[]
for i in sorted_idx:
    x = feature_names[i]
    y = xgb.feature_importances_[i]
    rows.append([i, x, y])  
dfx = pd.DataFrame(rows, columns=["id","Variable","Value"])
pd.set_option('display.max_columns', None)
print(dfx.tail(10))

eid = dfx['Variable'].tail(15).to_numpy()

# plt.barh(feature_names[sorted_idx][:-5], xgb.feature_importances_[sorted_idx][:-5])
# plt.xlabel("Xgboost Feature Importance")

      id                          Variable     Value
211  193                EPS Diluted Growth  0.009422
212  158                 Interest Coverage  0.010057
213   76                  priceToBookRatio  0.010354
214   74                Net Cash/Marketcap  0.011450
215  174              Tangible Asset Value  0.017536
216  142                          PE ratio  0.017872
217  157                     Current ratio  0.017959
218  128  capitalExpenditureCoverageRatios  0.018837
219  169                     Graham Number  0.019915
220   50               Deposit Liabilities  0.065098

As can be seen above from the feature importance variables, Deposit Liabilities is the most important factor and most of the other variables have a significantly lower influence (with many having a value of 0.0). This is an interesting point because in an earlier study regarding the random forest model (including Sectors), Deposit liabilities was also the top variable when we looked at the feature importance for that model.

import seaborn as sns
def correlation_heatmap(train):
    correlations = train.corr()

    fig, ax = plt.subplots(figsize=(10,10))
    sns.heatmap(correlations, vmax=1.0, center=0, fmt='.2f', cmap="YlGnBu",
                square=True, linewidths=.5, annot=True, cbar_kws={"shrink": .70}
                )
    plt.show();
    
correlation_heatmap(X_train[eid])

The heatmap above shows the top 15 most important factors and how they relate to one another. It is intersting that most of them have no relation to one another. However, Deposit Liabilities does have a 0.92 correlation with Tangible Asset Value. This is interesting as asset and liabilities are the two most fundamental statistics that a company has as it shows what the company is currently worth. Maybe there is some reasoning why both deposit liabilities and tangible asset value have something to do with the stock price change, as a company with a good balance sheet may seem safer to investors and become more desirable leading to a increase in the stock price of a company.

Conclusion
Although, this study doesn't really give us any concrete evidence of any relation between a company's financial factors and its stock price change, it gives us a better understanding of what factors have a higher than normal correlation with stock price change.
After using data science models ranging from linear regression and logisitic regression to decision trees and ensemble methods such as XGBoost, my study has found that Sector and Deposit liabilities have the greatest influence on stock price change. Many of the other factors have a close to 0 relation with stock price change. Most of the stocks that belong in one sector, seem to have similar financials and similar growth. This makes sense as this is why PE comparisons are often made betweeen companies in the same industry. In regards to deposit liabilities my theory is that because investors are loss aversive, they focus a lot of attention on the companies' liabilities to the bank as it represents the company's independence from giant financial investors such as banks.
The most interesting question that this project has allowed me to understand is why stock prices are not correlated with the respective companies financials. Through my experience is an avid investor and through doing this study, I have realised that a company's price on the stock market is not determined by financials and crunched by numbers, it is instead determined by the judgements of investors. And as long term thinkers, investors don't invest based on current financial numbers but instead on future numbers. Investors may instead focus their attention on politics, macroeconomics and long term trends that have longstanding effects on the future of specific industries/sectors.
The reason why the finance industry is so challenging is that there is no formula to allow for profit in the stock market. There is no guarantee for success and there is no investment without some sort of risk. This project has allowed me to have a better understanding of this concept and I hope to be able to continue doing this sort of data science project on the stock market and see if my hypothesis regarding the effects behavioral psychology and macro global trends have on stock price change is correct.

	Unnamed: 0	Revenue	Revenue Growth	Cost of Revenue	Gross Profit	R&D Expenses	SG&A Expense	Operating Expenses	Operating Income	Interest Expense	Earnings before Tax	Income Tax Expense	Net Income - Non-Controlling int	Net Income - Discontinued ops	Net Income	Preferred Dividends	Net Income Com	EPS	EPS Diluted	Weighted Average Shs Out	Weighted Average Shs Out (Dil)	Dividend per Share	Gross Margin	EBITDA Margin	EBIT Margin	Profit Margin	Free Cash Flow margin	EBITDA	EBIT	Consolidated Income	Earnings Before Tax Margin	Net Profit Margin	Cash and cash equivalents	Short-term investments	Cash and short-term investments	Receivables	Inventories	Total current assets	Property, Plant & Equipment Net	Goodwill and Intangible Assets	Long-term investments	Tax assets	Total non-current assets	Total assets	Payables	Short-term debt	Total current liabilities	Long-term debt	Total debt	Deferred revenue	Tax Liabilities	Total non-current liabilities	Total liabilities	Other comprehensive income	Retained earnings (deficit)	Total shareholders equity	Investments	Net Debt	Other Assets	Other Liabilities	Depreciation & Amortization	Stock-based compensation	Operating Cash Flow	Capital Expenditure	Acquisitions and disposals	Investment purchases and sales	Investing Cash flow	Issuance (repayment) of debt	Issuance (buybacks) of shares	Dividend payments	Financing Cash Flow	Effect of forex changes on cash	Net cash flow / Change in cash	Free Cash Flow	Net Cash/Marketcap	priceBookValueRatio	priceToBookRatio	priceToSalesRatio	priceEarningsRatio	priceToFreeCashFlowsRatio	priceToOperatingCashFlowsRatio	priceCashFlowRatio	priceEarningsToGrowthRatio	priceSalesRatio	dividendYield	enterpriseValueMultiple	priceFairValue	ebitperRevenue	ebtperEBIT	niperEBT	grossProfitMargin	operatingProfitMargin	pretaxProfitMargin	netProfitMargin	effectiveTaxRate	returnOnAssets	returnOnEquity	returnOnCapitalEmployed	nIperEBT	eBTperEBIT	eBITperRevenue	payablesTurnover	inventoryTurnover	fixedAssetTurnover	assetTurnover	currentRatio	quickRatio	cashRatio	daysOfSalesOutstanding	daysOfInventoryOutstanding	operatingCycle	daysOfPayablesOutstanding	cashConversionCycle	debtRatio	debtEquityRatio	longtermDebtToCapitalization	totalDebtToCapitalization	interestCoverage	cashFlowToDebtRatio	companyEquityMultiplier	operatingCashFlowPerShare	freeCashFlowPerShare	cashPerShare	payoutRatio	operatingCashFlowSalesRatio	freeCashFlowOperatingCashFlowRatio	cashFlowCoverageRatios	shortTermCoverageRatios	capitalExpenditureCoverageRatios	dividendpaidAndCapexCoverageRatios	dividendPayoutRatio	Revenue per Share	Net Income per Share	Operating Cash Flow per Share	Free Cash Flow per Share	Cash per Share	Book Value per Share	Tangible Book Value per Share	Shareholders Equity per Share	Interest Debt per Share	Market Cap	Enterprise Value	PE ratio	Price to Sales Ratio	POCF ratio	PFCF ratio	PB ratio	PTB ratio	EV to Sales	Enterprise Value over EBITDA	EV to Operating cash flow	EV to Free cash flow	Earnings Yield	Free Cash Flow Yield	Debt to Equity	Debt to Assets	Net Debt to EBITDA	Current ratio	Interest Coverage	Income Quality	Dividend Yield	Payout Ratio	SG&A to Revenue	R&D to Revenue	Intangibles to Total Assets	Capex to Operating Cash Flow	Capex to Revenue	Capex to Depreciation	Stock-based compensation to Revenue	Graham Number	ROIC	Return on Tangible Assets	Graham Net-Net	Working Capital	Tangible Asset Value	Net Current Asset Value	Invested Capital	Average Receivables	Average Payables	Average Inventory	Days Sales Outstanding	Days Payables Outstanding	Days of Inventory on Hand	Receivables Turnover	Payables Turnover	Inventory Turnover	ROE	Capex per Share	Gross Profit Growth	EBIT Growth	Operating Income Growth	Net Income Growth	EPS Growth	EPS Diluted Growth	Weighted Average Shares Growth	Weighted Average Shares Diluted Growth	Dividends per Share Growth	Operating Cash Flow growth	Free Cash Flow growth	10Y Revenue Growth (per Share)	5Y Revenue Growth (per Share)	3Y Revenue Growth (per Share)	10Y Operating CF Growth (per Share)	5Y Operating CF Growth (per Share)	3Y Operating CF Growth (per Share)	10Y Net Income Growth (per Share)	5Y Net Income Growth (per Share)	3Y Net Income Growth (per Share)	10Y Shareholders Equity Growth (per Share)	5Y Shareholders Equity Growth (per Share)	3Y Shareholders Equity Growth (per Share)	10Y Dividend per Share Growth (per Share)	5Y Dividend per Share Growth (per Share)	3Y Dividend per Share Growth (per Share)	Receivables growth	Inventory Growth	Asset Growth	Book Value per Share Growth	Debt Growth	R&D Expense Growth	SG&A Expenses Growth	Sector	2015 PRICE VAR [%]	Class
0	PG	7.440100e+10	-0.0713	3.903000e+10	3.537100e+10	0.000000e+00	2.146100e+10	2.146100e+10	1.391000e+10	7.090000e+08	1.449400e+10	2.851000e+09	1.420000e+08	-1.127000e+09	1.164300e+10	0.0	1.164300e+10	4.1900	4.010	2.705960e+09	2.705960e+09	2.448	0.4754	0.2470	0.2043	0.1560	0.1359	1.834400e+10	1.520300e+10	1.178500e+10	0.1948	0.1565	8.558000e+09	2.128000e+09	1.068600e+10	6.386000e+09	6.759000e+09	3.161700e+10	2.230400e+10	8.454700e+10	0.000000e+00	1.092000e+09	1.126490e+11	1.442660e+11	8.461000e+09	1.560600e+10	3.372600e+10	1.981100e+10	3.541700e+10	0.000000e+00	1.021800e+10	4.056400e+10	7.429000e+10	-7.662000e+09	8.499000e+10	6.997600e+10	2.128000e+09	2.473100e+10	7.786000e+09	9.659000e+09	3.141000e+09	3.600000e+08	1.395800e+10	-3.848000e+09	-2.400000e+07	-8.050000e+08	-4.100000e+09	3.543000e+09	-3.911000e+09	-6.911000e+09	-7.279000e+09	3.900000e+07	2.618000e+09	1.011000e+10	-0.1163	3.0391	0.0000	2.8583	18.7566	21.0348	15.2358	12.977501	15.557842	2.434644	0.038153	11.421116	2.588601	0.204339	0.953364	0.803298	0.475410	1.0	0.186960	0.156490	0.196702	0.5765	0.1664	0.0753	0.803298	0.953364	0.204339	1.5648	10.8869	3.335769	0.515721	0.9370	0.506197	0.253751	-63.2087	31.3287	NaN	41.5084	NaN	0.2455	0.5061	0.220644	0.336047	21.4429	0.394105	2.061650	5.1582	3.964	3.1626	0.584	0.187605	0.724316	0.394105	0.894400	3.627339	1.297332	0.593576	27.4952	4.3027	5.1582	3.964	3.1626	25.8600	22.0690	25.8600	13.0885	2.126614e+11	2.373924e+11	18.7566	2.8583	15.2358	21.0348	3.0391	0.0000	3.1907	12.9411	17.0076	23.4810	0.0533	0.0475	0.5061	0.2455	1.3482	0.9370	21.4429	1.1988	0.0311	0.584	0.2885	0.000	0.5860	0.2757	0.0517	-1.2251	0.0048	49.3756	0.0753	0.5765	-0.2007	-2.109000e+09	5.971900e+10	-4.267300e+10	5.285200e+10	6.447000e+09	8.619000e+09	6.834000e+09	31.3287	41.5084	-63.2087	11.5404	1.5648	10.8869	0.1664	-1.4220	-0.1185	-0.0001	0.0067	0.0293	0.0371	0.0389	-0.0127	NaN	0.0699	-0.0615	-0.0695	0.0323	0.0088	-0.0182	0.0355	0.0015	0.0260	0.0603	-0.0137	0.0060	0.1442	0.0353	0.0201	0.1013	0.0834	0.0751	-0.0187	-0.0217	0.0359	0.0316	0.1228	0.0000	-0.1746	Consumer Defensive	-9.323276	0
1	VIPS	3.734148e+09	1.1737	2.805625e+09	9.285226e+08	1.083303e+08	3.441414e+08	7.939267e+08	1.345959e+08	1.214869e+07	1.753823e+08	3.955957e+07	-1.431918e+07	0.000000e+00	1.358227e+08	0.0	1.358227e+08	0.2396	0.226	NaN	1.829362e+07	0.000	0.2487	0.0107	0.0502	0.0058	0.0704	2.456858e+08	1.875310e+08	1.215036e+08	0.0470	0.0364	7.735149e+08	6.083852e+08	1.381900e+09	1.169487e+07	5.793193e+08	2.134397e+09	3.085975e+08	1.906587e+08	6.299354e+07	3.764110e+07	6.022904e+08	2.736687e+09	1.000491e+09	0.000000e+00	1.639537e+09	6.223741e+08	6.223741e+08	3.141104e+07	0.000000e+00	6.615567e+08	2.301093e+09	-1.729254e+06	4.285438e+06	4.123545e+08	6.713788e+08	NaN	NaN	6.390460e+08	5.815483e+07	3.640523e+07	5.267456e+08	-2.637046e+08	-1.922552e+08	-2.126447e+08	-6.866936e+08	6.193268e+08	1.767840e+06	0.000000e+00	6.219136e+08	-1.564869e+07	4.463169e+08	2.630410e+08	0.0956	NaN	NaN	0.0443	81.5526	1.3589	NaN	0.678615	2.631793	0.095727	-0.000000	7.136538	0.866869	0.050221	0.935218	0.774438	0.248657	1.0	0.036045	0.036373	0.225562	0.0403	0.3294	0.0000	0.774438	0.935218	0.050221	NaN	NaN	12.100383	1.364477	0.2102	0.849993	0.471789	-75.3670	1.1431	NaN	97.7945	NaN	0.2274	1.5093	0.601485	0.601485	15.4363	0.846349	6.636734	35.6714	22.976	52.3827	0.000	0.141062	0.499370	0.846349	NaN	1.997484	1.997484	-0.000000	252.8778	9.1980	35.6714	22.976	52.3827	0.7278	4.4939	27.9248	42.1474	6.253200e+09	NaN	81.5526	0.0443	NaN	1.3589	NaN	NaN	NaN	NaN	NaN	NaN	0.0123	NaN	1.5093	0.2274	NaN	0.2102	15.4363	3.8782	0.0000	0.000	0.0922	0.029	0.0697	0.5006	0.0706	-4.5345	0.0097	NaN	0.0000	0.0403	0.0534	3.065163e+09	2.546029e+09	-1.666966e+08	7.553510e+08	NaN	NaN	NaN	1.1431	97.7945	-75.3670	NaN	NaN	NaN	0.3294	-17.8582	1.2489	1.6142	1.9463	1.5649	1.4625	1.4646	0.0184	0.0195	0.0000	0.2134	-0.3626	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	1.6484	1.7313	Consumer Defensive	-25.512193	0
2	KR	9.837500e+10	0.0182	7.813800e+10	2.023700e+10	0.000000e+00	1.519600e+10	1.751200e+10	2.725000e+09	4.430000e+08	2.270000e+09	7.510000e+08	1.200000e+07	0.000000e+00	1.519000e+09	0.0	1.519000e+09	1.4700	1.450	1.033053e+09	1.028000e+09	0.307	0.2057	0.0450	0.0276	0.0150	0.0126	4.416000e+09	2.713000e+09	1.531000e+09	0.0231	0.0154	4.010000e+08	0.000000e+00	4.010000e+08	1.116000e+09	5.651000e+09	8.830000e+09	1.689300e+10	2.837000e+09	0.000000e+00	0.000000e+00	2.045100e+10	2.928100e+10	4.881000e+09	1.657000e+09	1.070500e+10	9.653000e+09	1.131000e+10	0.000000e+00	1.629000e+09	1.318100e+10	2.388600e+10	-4.640000e+08	1.098100e+10	5.384000e+09	0.000000e+00	1.090900e+10	1.662000e+09	4.167000e+09	1.703000e+09	1.070000e+08	3.573000e+09	-2.330000e+09	-2.344000e+09	0.000000e+00	-4.771000e+09	2.093000e+09	-4.130000e+08	-3.190000e+08	1.361000e+09	0.000000e+00	1.630000e+08	1.243000e+09	-0.5969	3.3579	7.0982	0.1858	12.0340	14.6302	5.1147	0.000000	0.000000	0.000000	NaN	2.276721	0.000000	0.027578	0.836712	0.669163	0.205713	1.0	0.027700	0.015441	0.330837	0.1011	0.2821	0.0859	0.669163	0.836712	0.027578	4.9593	18.2227	5.823418	3.359687	0.8250	0.141709	0.037459	-26.3971	4.1407	NaN	18.1099	NaN	0.3863	2.1007	0.641950	0.677489	6.1242	0.315915	5.438522	3.4757	0.837	0.3901	0.209	0.036320	0.347887	0.315915	2.156307	1.533476	1.348811	0.210007	95.6955	1.4776	3.4757	0.837	0.3901	5.2370	25.7240	5.2374	11.0019	1.827470e+10	2.918370e+10	12.0340	0.1858	5.1147	14.6302	3.3579	7.0982	0.2967	6.6086	8.1678	23.4784	0.0831	0.0680	2.1007	0.3863	2.4703	0.8250	6.1242	2.3522	0.0174	0.209	0.1545	0.000	0.0969	0.6521	0.0237	-1.3682	0.0011	13.2000	0.0859	0.1011	-0.8239	-1.875000e+09	2.644400e+10	-1.505600e+10	2.664800e+10	1.083500e+09	4.682500e+09	5.398500e+09	4.1407	18.1099	-26.3971	90.7937	4.9593	18.2227	0.2821	-2.2665	0.0173	-0.0145	-0.0141	0.0147	0.0576	0.0469	-0.0356	-0.0317	0.2429	0.2095	0.3935	0.1027	0.1038	0.1399	0.0889	0.0937	0.0946	0.2272	0.0906	0.1892	0.0676	0.0558	0.0789	0.0000	0.1215	0.1633	0.0618	0.0981	0.1886	0.3268	0.2738	0.0000	0.0234	Consumer Defensive	33.118297	1
3	RAD	2.552641e+10	0.0053	1.820268e+10	7.323734e+09	0.000000e+00	6.561162e+09	6.586482e+09	7.372520e+08	4.245910e+08	2.502180e+08	8.040000e+05	0.000000e+00	0.000000e+00	2.494140e+08	33998000.0	2.154160e+08	4.6000	4.600	4.832628e+07	4.832628e+07	0.000	0.2869	0.0420	0.0264	0.0080	0.0144	1.078550e+09	6.748090e+08	2.494140e+08	0.0098	0.0098	1.464060e+08	0.000000e+00	1.464060e+08	9.490620e+08	2.993948e+09	4.285125e+09	1.957329e+09	4.312270e+08	0.000000e+00	0.000000e+00	2.659746e+09	6.944871e+09	1.292419e+09	4.917400e+07	2.507452e+09	5.707969e+09	5.757143e+09	0.000000e+00	0.000000e+00	6.551121e+09	9.058573e+09	-3.733400e+07	-7.515848e+09	-2.113702e+09	0.000000e+00	5.610737e+09	1.957090e+08	1.165859e+09	4.037410e+08	-1.047100e+07	7.020460e+08	-3.338700e+08	0.000000e+00	4.115500e+07	-3.649240e+08	-2.954350e+08	3.321700e+07	0.000000e+00	-3.201680e+08	0.000000e+00	1.695400e+07	3.681760e+08	-0.8822	0.0000	0.0000	0.2491	28.6087	17.2736	9.0589	0.000000	0.000000	0.000000	NaN	5.428005	NaN	0.026436	0.370798	0.996787	0.286908	1.0	0.028882	0.009771	0.003213	0.0668	-0.1180	0.1062	0.996787	0.370798	0.026436	4.9289	8.3030	13.041452	3.675578	1.7090	0.436885	0.058388	-60.0346	13.5706	NaN	18.4802	NaN	0.8290	-2.7237	1.588076	1.580139	1.5893	0.121943	NaN	14.5272	0.557	3.0295	0.000	0.027503	0.524433	0.121943	14.276772	2.102753	2.102753	-0.000000	528.2098	5.1610	14.5272	0.557	3.0295	-43.7380	134.7850	-43.7381	119.1307	6.359738e+09	1.197048e+10	28.6087	0.2491	9.0589	17.2736	0.0000	0.0000	0.4689	11.0987	17.0508	32.5129	0.0350	0.0579	-2.7237	0.8290	5.2021	1.7090	1.5893	3.2590	0.0000	0.000	0.2570	0.000	0.0621	0.4756	0.0131	-0.8269	-0.0004	0.0000	0.1062	0.0668	-0.7506	1.777673e+09	6.513644e+09	-4.773448e+09	9.616929e+09	9.392690e+08	1.338532e+09	3.074345e+09	13.5706	18.4802	-60.0346	27.1769	4.9289	8.3030	-0.1180	-6.9087	0.0007	0.2904	0.1113	1.0043	0.9167	0.9167	0.0694	NaN	0.0000	-0.1434	-0.2691	-0.0195	-0.0290	-0.0230	0.0513	0.1164	0.1777	0.0480	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000	0.0211	-0.0510	-0.0189	0.1963	-0.0458	0.0000	-0.0060	Consumer Defensive	2.752291	1
4	GIS	1.790960e+10	0.0076	1.153980e+10	6.369800e+09	0.000000e+00	3.474300e+09	3.412400e+09	2.957400e+09	3.024000e+08	2.707700e+09	8.833000e+08	3.690000e+07	0.000000e+00	1.824400e+09	0.0	1.824400e+09	2.9000	2.830	6.144922e+08	6.144922e+08	1.550	0.3557	0.2010	0.1681	0.1020	0.1052	3.595500e+09	3.010100e+09	1.861300e+09	0.1512	0.1019	8.673000e+08	0.000000e+00	8.673000e+08	1.483600e+09	1.559400e+09	4.393500e+09	3.941900e+09	1.366480e+10	0.000000e+00	7.410000e+07	1.875220e+10	2.314570e+10	1.611300e+09	2.362300e+09	5.423500e+09	6.423500e+09	8.785800e+09	0.000000e+00	1.666000e+09	9.732700e+09	1.515620e+10	-1.340300e+09	1.178720e+10	6.534800e+09	0.000000e+00	7.918500e+09	4.832000e+08	1.449900e+09	5.854000e+08	3.920000e+07	2.541000e+09	-6.569000e+08	6.670000e+07	2.930000e+07	-5.618000e+08	8.011000e+08	-1.637200e+09	-9.833000e+08	-1.824100e+09	-2.920000e+07	1.259000e+08	1.884100e+09	-0.2376	5.1004	0.0000	1.8610	18.7034	17.6902	13.1169	10.952350	15.254287	1.553911	0.035332	9.767966	4.258726	0.168072	0.899538	0.673782	0.355664	1.0	0.165129	0.101867	0.326218	0.6265	0.2792	0.1041	0.673782	0.899538	0.168072	2.8234	11.5363	4.543393	0.773777	0.8100	0.433465	0.159915	-49.3233	30.2360	NaN	32.8385	NaN	0.3796	1.3445	0.495705	0.573463	9.9540	0.289217	3.541914	4.1351	3.624	1.4114	0.534	0.141879	0.741480	0.289217	1.075647	3.868169	1.549201	0.538972	29.1454	2.9690	4.1351	3.624	1.4114	10.6340	15.4290	10.6345	14.2977	3.333006e+10	4.124856e+10	18.7034	1.8610	13.1169	17.6902	5.1004	0.0000	2.3032	11.4723	16.2332	21.8930	0.0535	0.0565	1.3445	0.3796	2.2023	0.8100	9.9540	1.3928	0.0286	0.534	0.1940	0.000	0.5904	0.2585	0.0367	-1.1221	0.0022	26.3414	0.1041	0.6265	-0.3229	-1.030000e+09	9.480900e+09	-1.076270e+10	1.197590e+10	1.465000e+09	1.517250e+09	1.552450e+09	30.2360	32.8385	-49.3233	12.2250	2.8234	11.5363	0.2792	-1.0690	-0.0084	0.0332	0.0370	-0.0166	0.0140	0.0143	-0.0468	NaN	0.1742	-0.1316	-0.1936	0.0704	0.0567	0.0773	0.0782	0.0828	0.1991	0.0776	0.0841	0.0177	0.0427	0.0623	0.0217	0.1092	0.1250	0.1144	0.0257	0.0090	0.0215	0.0274	0.1025	0.0000	-0.0220	Consumer Defensive	12.897715	1

	Stock Name	Revenue	Revenue Growth	Cost of Revenue	Gross Profit	R&D Expenses	SG&A Expense	Operating Expenses	Operating Income	Interest Expense	Earnings before Tax	Income Tax Expense	Net Income - Non-Controlling int	Net Income - Discontinued ops	Net Income	Preferred Dividends	Net Income Com	EPS	EPS Diluted	Weighted Average Shs Out	Weighted Average Shs Out (Dil)	Dividend per Share	Gross Margin	EBITDA Margin	EBIT Margin	Profit Margin	Free Cash Flow margin	EBITDA	EBIT	Consolidated Income	Earnings Before Tax Margin	Net Profit Margin	Cash and cash equivalents	Short-term investments	Cash and short-term investments	Receivables	Inventories	Total current assets	Property, Plant & Equipment Net	Goodwill and Intangible Assets	Long-term investments	Tax assets	Total non-current assets	Total assets	Payables	Short-term debt	Total current liabilities	Long-term debt	Total debt	Deferred revenue	Tax Liabilities	Deposit Liabilities	Total non-current liabilities	Total liabilities	Other comprehensive income	Retained earnings (deficit)	Total shareholders equity	Investments	Net Debt	Other Assets	Other Liabilities	Depreciation & Amortization	Stock-based compensation	Operating Cash Flow	Capital Expenditure	Acquisitions and disposals	Investment purchases and sales	Investing Cash flow	Issuance (repayment) of debt	Issuance (buybacks) of shares	Dividend payments	Financing Cash Flow	Effect of forex changes on cash	Net cash flow / Change in cash	Free Cash Flow	Net Cash/Marketcap	priceBookValueRatio	priceToBookRatio	priceToSalesRatio	priceEarningsRatio	priceToFreeCashFlowsRatio	priceToOperatingCashFlowsRatio	priceCashFlowRatio	priceEarningsToGrowthRatio	priceSalesRatio	dividendYield	enterpriseValueMultiple	priceFairValue	ebitperRevenue	ebtperEBIT	niperEBT	grossProfitMargin	operatingProfitMargin	pretaxProfitMargin	netProfitMargin	effectiveTaxRate	returnOnAssets	returnOnEquity	returnOnCapitalEmployed	nIperEBT	eBTperEBIT	eBITperRevenue	payablesTurnover	inventoryTurnover	fixedAssetTurnover	assetTurnover	currentRatio	quickRatio	cashRatio	daysOfSalesOutstanding	daysOfInventoryOutstanding	operatingCycle	daysOfPayablesOutstanding	cashConversionCycle	debtRatio	debtEquityRatio	longtermDebtToCapitalization	totalDebtToCapitalization	interestCoverage	cashFlowToDebtRatio	companyEquityMultiplier	operatingCashFlowPerShare	freeCashFlowPerShare	cashPerShare	payoutRatio	operatingCashFlowSalesRatio	freeCashFlowOperatingCashFlowRatio	cashFlowCoverageRatios	shortTermCoverageRatios	capitalExpenditureCoverageRatios	dividendpaidAndCapexCoverageRatios	dividendPayoutRatio	Revenue per Share	Net Income per Share	Operating Cash Flow per Share	Free Cash Flow per Share	Cash per Share	Book Value per Share	Tangible Book Value per Share	Shareholders Equity per Share	Interest Debt per Share	Market Cap	Enterprise Value	PE ratio	Price to Sales Ratio	POCF ratio	PFCF ratio	PB ratio	PTB ratio	EV to Sales	Enterprise Value over EBITDA	EV to Operating cash flow	EV to Free cash flow	Earnings Yield	Free Cash Flow Yield	Debt to Equity	Debt to Assets	Net Debt to EBITDA	Current ratio	Interest Coverage	Income Quality	Dividend Yield	Payout Ratio	SG&A to Revenue	R&D to Revenue	Intangibles to Total Assets	Capex to Operating Cash Flow	Capex to Revenue	Capex to Depreciation	Stock-based compensation to Revenue	Graham Number	ROIC	Return on Tangible Assets	Graham Net-Net	Working Capital	Tangible Asset Value	Net Current Asset Value	Invested Capital	Average Receivables	Average Payables	Average Inventory	Days Sales Outstanding	Days Payables Outstanding	Days of Inventory on Hand	Receivables Turnover	Payables Turnover	Inventory Turnover	ROE	Capex per Share	Gross Profit Growth	EBIT Growth	Operating Income Growth	Net Income Growth	EPS Growth	EPS Diluted Growth	Weighted Average Shares Growth	Weighted Average Shares Diluted Growth	Dividends per Share Growth	Operating Cash Flow growth	Free Cash Flow growth	10Y Revenue Growth (per Share)	5Y Revenue Growth (per Share)	3Y Revenue Growth (per Share)	10Y Operating CF Growth (per Share)	5Y Operating CF Growth (per Share)	3Y Operating CF Growth (per Share)	10Y Net Income Growth (per Share)	5Y Net Income Growth (per Share)	3Y Net Income Growth (per Share)	10Y Shareholders Equity Growth (per Share)	5Y Shareholders Equity Growth (per Share)	3Y Shareholders Equity Growth (per Share)	10Y Dividend per Share Growth (per Share)	5Y Dividend per Share Growth (per Share)	3Y Dividend per Share Growth (per Share)	Receivables growth	Inventory Growth	Asset Growth	Book Value per Share Growth	Debt Growth	R&D Expense Growth	SG&A Expenses Growth	Sector	2015 PRICE VAR [%]	Class
column type	object	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	float64	object	float64	int64
null values (nb)	0	44	236	74	52	136	59	63	55	63	80	66	149	149	23	149	15	72	84	236	110	149	47	202	50	201	159	64	4	23	47	57	58	933	930	59	115	871	88	35	912	133	908	33	90	898	879	854	78	136	131	183	911	39	73	164	33	124	1277	1274	830	43	68	40	52	124	22	25	61	108	142	11	83	15	47	349	659	659	181	181	181	338	1053	1296	396	688	944	428	346	1118	1243	346	346	346	346	1243	995	192	987	1243	1118	346	781	306	345	173	967	972	972	182	181	3807	181	3807	185	185	282	269	172	1596	376	230	229	232	207	346	1016	1596	1956	421	293	1296	227	227	218	217	232	231	231	232	231	298	654	154	154	315	154	644	644	654	654	654	654	154	315	185	185	1424	988	144	156	154	186	156	162	173	144	156	158	165	644	987	995	315	987	173	958	185	288	288	288	182	182	183	306	781	306	193	218	231	231	231	243	282	293	336	1049	237	232	238	1547	1037	733	1536	1031	726	1547	1037	733	1585	1064	796	1465	952	600	290	290	290	369	302	247	243	0	0	0
null values (%)	0	1.15546	6.19748	1.94328	1.36555	3.57143	1.54937	1.65441	1.44433	1.65441	2.10084	1.73319	3.91282	3.91282	0.603992	3.91282	0.393908	1.89076	2.20588	6.19748	2.88866	3.91282	1.23424	5.30462	1.31303	5.27836	4.17542	1.68067	0.105042	0.603992	1.23424	1.49685	1.52311	24.5011	24.4223	1.54937	3.01996	22.8729	2.31092	0.919118	23.9496	3.49265	23.8445	0.866597	2.36345	23.5819	23.083	22.4265	2.04832	3.57143	3.44013	4.80567	23.9233	1.02416	1.91702	4.30672	0.866597	3.2563	33.5347	33.4559	21.7962	1.1292	1.78571	1.05042	1.36555	3.2563	0.577731	0.656513	1.60189	2.83613	3.72899	0.288866	2.17962	0.393908	1.23424	9.16492	17.3057	17.3057	4.75315	4.75315	4.75315	8.87605	27.6523	34.0336	10.3992	18.0672	24.7899	11.2395	9.08613	29.3592	32.6418	9.08613	9.08613	9.08613	9.08613	32.6418	26.1292	5.04202	25.9191	32.6418	29.3592	9.08613	20.5095	8.03571	9.05987	4.54307	25.3939	25.5252	25.5252	4.77941	4.75315	99.9737	4.75315	99.9737	4.85819	4.85819	7.40546	7.06408	4.51681	41.9118	9.87395	6.03992	6.01366	6.09244	5.43592	9.08613	26.6807	41.9118	51.3655	11.0557	7.69433	34.0336	5.96113	5.96113	5.72479	5.69853	6.09244	6.06618	6.06618	6.09244	6.06618	7.82563	17.1744	4.04412	4.04412	8.27206	4.04412	16.9118	16.9118	17.1744	17.1744	17.1744	17.1744	4.04412	8.27206	4.85819	4.85819	37.395	25.9454	3.78151	4.09664	4.04412	4.88445	4.09664	4.2542	4.54307	3.78151	4.09664	4.14916	4.33298	16.9118	25.9191	26.1292	8.27206	25.9191	4.54307	25.1576	4.85819	7.56303	7.56303	7.56303	4.77941	4.77941	4.80567	8.03571	20.5095	8.03571	5.06828	5.72479	6.06618	6.06618	6.06618	6.3813	7.40546	7.69433	8.82353	27.5473	6.22374	6.09244	6.25	40.625	27.2321	19.2489	40.3361	27.0746	19.0651	40.625	27.2321	19.2489	41.6229	27.9412	20.9034	38.4716	25	15.7563	7.61555	7.61555	7.61555	9.69013	7.93067	6.48634	6.3813	0	0	0

	Revenue	Revenue Growth	Cost of Revenue	Gross Profit	R&D Expenses	SG&A Expense	Operating Expenses	Operating Income	Interest Expense	Earnings before Tax	Income Tax Expense	Net Income - Non-Controlling int	Net Income - Discontinued ops	Net Income	Preferred Dividends	Net Income Com	EPS	EPS Diluted	Weighted Average Shs Out	Weighted Average Shs Out (Dil)	Dividend per Share	Gross Margin	EBITDA Margin	EBIT Margin	Profit Margin	Free Cash Flow margin	EBITDA	EBIT	Consolidated Income	Earnings Before Tax Margin	Net Profit Margin	Cash and cash equivalents	Short-term investments	Cash and short-term investments	Receivables	Inventories	Total current assets	Property, Plant & Equipment Net	Goodwill and Intangible Assets	Long-term investments	Tax assets	Total non-current assets	Total assets	Payables	Short-term debt	Total current liabilities	Long-term debt	Total debt	Deferred revenue	Tax Liabilities	Deposit Liabilities	Total non-current liabilities	Total liabilities	Other comprehensive income	Retained earnings (deficit)	Total shareholders equity	Investments	Net Debt	Other Assets	Other Liabilities	Depreciation & Amortization	Stock-based compensation	Operating Cash Flow	Capital Expenditure	Acquisitions and disposals	Investment purchases and sales	Investing Cash flow	Issuance (repayment) of debt	Issuance (buybacks) of shares	Dividend payments	Financing Cash Flow	Effect of forex changes on cash	Net cash flow / Change in cash	Free Cash Flow	Net Cash/Marketcap	priceBookValueRatio	priceToBookRatio	priceToSalesRatio	priceEarningsRatio	priceToFreeCashFlowsRatio	priceToOperatingCashFlowsRatio	priceCashFlowRatio	priceEarningsToGrowthRatio	priceSalesRatio	dividendYield	enterpriseValueMultiple	priceFairValue	ebitperRevenue	ebtperEBIT	niperEBT	grossProfitMargin	operatingProfitMargin	pretaxProfitMargin	netProfitMargin	effectiveTaxRate	returnOnAssets	returnOnEquity	returnOnCapitalEmployed	nIperEBT	eBTperEBIT	eBITperRevenue	payablesTurnover	inventoryTurnover	fixedAssetTurnover	assetTurnover	currentRatio	quickRatio	cashRatio	daysOfSalesOutstanding	daysOfInventoryOutstanding	operatingCycle	daysOfPayablesOutstanding	cashConversionCycle	debtRatio	debtEquityRatio	longtermDebtToCapitalization	totalDebtToCapitalization	interestCoverage	cashFlowToDebtRatio	companyEquityMultiplier	operatingCashFlowPerShare	freeCashFlowPerShare	cashPerShare	payoutRatio	operatingCashFlowSalesRatio	freeCashFlowOperatingCashFlowRatio	cashFlowCoverageRatios	shortTermCoverageRatios	capitalExpenditureCoverageRatios	dividendpaidAndCapexCoverageRatios	dividendPayoutRatio	Revenue per Share	Net Income per Share	Operating Cash Flow per Share	Free Cash Flow per Share	Cash per Share	Book Value per Share	Tangible Book Value per Share	Shareholders Equity per Share	Interest Debt per Share	Market Cap	Enterprise Value	PE ratio	Price to Sales Ratio	POCF ratio	PFCF ratio	PB ratio	PTB ratio	EV to Sales	Enterprise Value over EBITDA	EV to Operating cash flow	EV to Free cash flow	Earnings Yield	Free Cash Flow Yield	Debt to Equity	Debt to Assets	Net Debt to EBITDA	Current ratio	Interest Coverage	Income Quality	Dividend Yield	Payout Ratio	SG&A to Revenue	R&D to Revenue	Intangibles to Total Assets	Capex to Operating Cash Flow	Capex to Revenue	Capex to Depreciation	Stock-based compensation to Revenue	Graham Number	ROIC	Return on Tangible Assets	Graham Net-Net	Working Capital	Tangible Asset Value	Net Current Asset Value	Invested Capital	Average Receivables	Average Payables	Average Inventory	Days Sales Outstanding	Days Payables Outstanding	Days of Inventory on Hand	Receivables Turnover	Payables Turnover	Inventory Turnover	ROE	Capex per Share	Gross Profit Growth	EBIT Growth	Operating Income Growth	Net Income Growth	EPS Growth	EPS Diluted Growth	Weighted Average Shares Growth	Weighted Average Shares Diluted Growth	Dividends per Share Growth	Operating Cash Flow growth	Free Cash Flow growth	10Y Revenue Growth (per Share)	5Y Revenue Growth (per Share)	3Y Revenue Growth (per Share)	10Y Operating CF Growth (per Share)	5Y Operating CF Growth (per Share)	3Y Operating CF Growth (per Share)	10Y Net Income Growth (per Share)	5Y Net Income Growth (per Share)	3Y Net Income Growth (per Share)	10Y Shareholders Equity Growth (per Share)	5Y Shareholders Equity Growth (per Share)	3Y Shareholders Equity Growth (per Share)	10Y Dividend per Share Growth (per Share)	5Y Dividend per Share Growth (per Share)	3Y Dividend per Share Growth (per Share)	Receivables growth	Inventory Growth	Asset Growth	Book Value per Share Growth	Debt Growth	R&D Expense Growth	SG&A Expenses Growth	2015 PRICE VAR [%]	Class
count	3.764000e+03	3572.000000	3.734000e+03	3.756000e+03	3.672000e+03	3.749000e+03	3.745000e+03	3.753000e+03	3.745000e+03	3.728000e+03	3.742000e+03	3.659000e+03	3.659000e+03	3.785000e+03	3.659000e+03	3.793000e+03	3.736000e+03	3.724000e+03	3.572000e+03	3.698000e+03	3659.000000	3761.000000	3606.000000	3758.000000	3607.000000	3649.000000	3.744000e+03	3.804000e+03	3.785000e+03	3761.000000	3751.000000	3.750000e+03	2.875000e+03	2.878000e+03	3.749000e+03	3.693000e+03	2.937000e+03	3.720000e+03	3.773000e+03	2.896000e+03	3.675000e+03	2.900000e+03	3.775000e+03	3.718000e+03	2.910000e+03	2.929000e+03	2.954000e+03	3.730000e+03	3.672000e+03	3.677000e+03	3.625000e+03	2.897000e+03	3.769000e+03	3.735000e+03	3.644000e+03	3.775000e+03	3.684000e+03	2.531000e+03	2.534000e+03	2.978000e+03	3.765000e+03	3.740000e+03	3.768000e+03	3.756000e+03	3.684000e+03	3.786000e+03	3.783000e+03	3.747000e+03	3.700000e+03	3.666000e+03	3.797000e+03	3.725000e+03	3.793000e+03	3.761000e+03	3459.000000	3.149000e+03	3.149000e+03	3627.000000	3627.000000	3627.000000	3470.000000	2755.000000	2512.000000	3412.000000	3120.000000	2864.000000	3380.000000	3462.000000	2690.000000	2565.000000	3462.000000	3462.0	3462.000000	3462.000000	2565.000000	2813.000000	3.616000e+03	2821.000000	2565.000000	2690.000000	3462.000000	3027.000000	3502.000000	3463.000000	3635.000000	2841.000000	2836.000000	2836.000000	3626.000000	3627.000000	1.0000	3627.000000	1.0000	3623.000000	3623.000000	3526.000000	3539.000000	3636.000000	2212.000000	3.432000e+03	3.578000e+03	3.579000e+03	3.576000e+03	3601.000000	3462.000000	2792.000000	2212.000000	1852.000000	3387.000000	3515.000000	2512.000000	3.581000e+03	3.581000e+03	3.590000e+03	3.591000e+03	3.576000e+03	3.577000e+03	3.577000e+03	3.576000e+03	3.577000e+03	3.510000e+03	3.154000e+03	3654.000000	3654.000000	3493.000000	3654.000000	3.164000e+03	3.164000e+03	3154.000000	3154.000000	3154.000000	3154.000000	3654.000000	3493.000000	3623.000000	3623.000000	2384.000000	2820.000000	3664.000000	3652.000000	3654.000000	3622.000000	3652.000000	3646.000000	3635.000000	3664.000000	3652.000000	3650.000000	3643.000000	3164.000000	2821.000000	2813.000000	3493.000000	2.821000e+03	3.635000e+03	2.850000e+03	3.623000e+03	3.520000e+03	3.520000e+03	3.520000e+03	3626.000000	3626.000000	3625.000000	3502.000000	3027.000000	3502.000000	3.615000e+03	3.590000e+03	3577.000000	3577.000000	3577.000000	3565.000000	3526.000000	3515.000000	3472.000000	2759.000000	3571.000000	3576.000000	3570.000000	2261.000000	2771.000000	3075.000000	2272.000000	2777.000000	3082.000000	2261.000000	2771.000000	3075.000000	2223.000000	2744.000000	3012.000000	2343.000000	2856.000000	3208.000000	3518.000000	3518.000000	3518.000000	3439.000000	3506.000000	3561.000000	3565.000000	3.808000e+03	3808.000000
mean	5.879050e+09	12.954244	3.700973e+09	2.188214e+09	9.401830e+07	9.307406e+08	1.438144e+09	6.748248e+08	1.002018e+08	5.725866e+08	1.762750e+08	1.557750e+07	-3.453915e+06	4.894335e+08	5.343177e+06	4.839166e+08	-2.607365e+04	-2.653476e+04	2.865321e+08	2.904988e+08	3.413530	0.469036	-4.612215	-3.736484	-5.102894	-3.229727	1.041692e+09	7.782098e+08	5.051208e+08	-4.039696	-3.784724	1.615347e+09	2.612956e+08	8.496238e+08	9.615113e+08	4.866494e+08	3.111922e+09	2.941115e+09	1.664754e+09	4.245599e+08	1.849196e+08	5.816330e+09	2.642981e+10	1.024153e+09	4.672686e+08	2.071357e+09	3.857223e+09	5.994973e+09	1.228135e+08	3.984022e+08	5.179785e+09	3.449642e+09	2.208640e+10	-9.030215e+07	2.206764e+09	4.124379e+09	1.001118e+10	3.440958e+09	5.282998e+08	7.502376e+08	3.290080e+08	2.479993e+07	1.191508e+09	-4.886058e+08	-8.583320e+07	-2.183940e+08	-8.250133e+08	6.200769e+07	-8.792079e+07	-1.763699e+08	4.896244e+07	-4.174801e+06	9.939070e+07	4.453619e+08	-2.314196	2.790536e+04	2.790766e+04	45.129427	28.882535	32.866710	38.854715	22.758312	33.880458	38.046475	0.247325	37.229890	5.203304	-4.149450	0.679354	0.743828	0.428221	1.0	-4.982938	-4.333599	0.258234	0.073683	1.941046e+03	-0.430209	0.743828	0.679354	-4.149450	5.164910	27.891355	149.363526	0.728869	6.840022	6.304907	1.823634	-56.050637	87.649135	134.5455	232.431425	93.0392	0.246147	1.247731	0.268564	0.289505	-20.656176	391.603619	8.358932e+02	1.377534e+04	7.113342e+02	4.169637e+04	0.361147	13.367974	-0.345192	391.603619	566.729529	-2.855130	-6.278449	0.591802	1.342362e+05	5.323582e+03	1.372931e+04	7.089520e+02	4.169637e+04	2.271467e+05	3.941552e+05	1.484680e+05	2.090793e+05	3.503639e+10	7.994689e+09	29.168906	44.972676	38.660934	32.749071	2.777308e+04	2.777536e+04	42.642911	27.151004	27.509114	14.089356	-6.636770	-7.574803	1.247731	0.246147	-3.957762	6.872239	-20.130356	2.436330	0.019972	0.359600	2.003619	2.086115	0.142232	1.062227	0.224709	-0.452700	0.700113	396.354959	-0.430209	0.073683	-2.955065	5.915030e+09	1.756294e+10	-2.266889e+09	1.934120e+10	9.426405e+08	8.949436e+08	3.726066e+08	87.654608	232.491967	-56.066099	48.114703	5.164910	27.891355	1.941584e+03	-1.516578e+04	95.342800	3.631871	3.855939	-0.426786	0.164305	0.140394	1.824807	2.192586	0.160613	-1.341505	-23.892636	0.031811	0.043491	0.046010	0.044151	0.057456	0.067409	0.038966	0.069918	0.088356	0.045218	0.057705	0.065999	-0.019197	0.006783	0.036206	4.445793	0.164692	2.279999	0.813931	1.361780	0.596529	0.382284	9.751560e+02	0.429097
std	3.901741e+10	705.605473	3.040688e+10	1.159028e+10	6.408912e+08	4.820928e+09	7.305171e+09	3.786291e+09	6.991091e+08	2.793740e+09	1.011342e+09	1.620427e+08	2.360399e+08	4.396692e+09	6.760107e+07	4.392279e+09	1.672636e+06	1.673673e+06	2.304823e+09	2.403028e+09	167.005657	1.061432	82.473116	63.214487	90.142740	54.458512	6.057032e+09	5.329003e+09	4.415837e+09	64.600341	63.300580	1.494661e+10	2.125871e+09	3.855421e+09	6.066000e+09	6.848526e+09	3.033776e+10	2.119840e+10	9.983609e+09	3.811519e+09	1.564932e+09	2.702205e+10	3.570764e+11	9.989482e+09	5.767364e+09	1.901577e+10	8.263929e+10	7.958489e+10	9.394575e+08	2.126550e+09	6.039611e+10	2.168603e+10	3.254540e+11	1.335849e+09	1.419948e+10	2.972280e+10	9.501288e+10	8.948312e+10	1.433403e+10	5.502288e+09	2.401693e+09	1.535355e+08	1.683247e+10	3.741773e+09	7.662339e+08	5.212931e+09	6.572521e+09	3.697576e+09	2.686317e+09	7.754103e+08	9.871214e+09	9.388622e+08	4.426819e+09	4.060265e+09	91.010023	1.565693e+06	1.565693e+06	634.798219	169.220834	277.566797	1072.187989	122.713659	193.948036	660.011982	7.912518	883.203941	48.955775	66.153186	3.659063	3.232350	3.961384	0.0	85.478456	66.736850	3.232208	9.590453	1.167203e+05	17.743103	3.232350	3.659063	66.153186	20.830615	271.747359	7590.193566	0.936109	122.604424	122.705345	6.159653	197.928846	816.891148	NaN	3655.903472	NaN	0.527067	36.500177	1.195338	1.048213	1727.077299	18353.502069	4.867024e+04	3.995943e+05	1.921795e+05	1.199459e+06	4.741215	984.117469	12.929755	18353.502069	20101.549721	1163.836896	1137.993942	3.142069	4.129140e+06	1.825286e+05	3.989265e+05	1.918581e+05	1.199459e+06	1.072112e+07	1.892736e+07	3.411571e+06	5.622659e+06	1.623730e+12	2.808666e+10	169.445219	632.513736	1068.654944	276.548915	1.561977e+06	1.561977e+06	628.930733	630.194602	627.245999	955.723634	354.498211	438.287961	36.500177	0.527067	202.169311	123.059397	1720.612738	55.011342	0.118780	4.727528	34.124975	39.510150	0.197968	11.326515	1.449700	658.360878	21.442018	12755.983156	17.743103	9.590453	98.712298	2.440567e+11	1.213901e+11	9.859086e+09	1.376875e+11	6.025057e+09	6.990105e+09	1.861781e+09	817.003748	3656.405881	197.953962	1833.381818	20.830615	271.747359	1.167364e+05	3.855947e+05	5630.833201	228.011155	227.116093	29.049961	5.920824	5.934814	67.636280	73.075979	1.799747	74.436352	1398.408015	0.158958	0.205229	0.420155	0.117730	0.224307	0.292005	0.115010	0.208890	0.470741	0.127331	0.243623	0.343162	0.229890	0.229982	0.259765	161.135098	2.195115	92.428920	25.630214	18.734301	25.894049	4.696975	4.098461e+04	0.495012
min	-6.276160e+08	-1.773200	-5.455740e+08	-1.105000e+09	-1.500000e+05	0.000000e+00	-1.088448e+09	-6.786000e+09	-2.250000e+08	-8.878000e+09	-2.081114e+09	-1.587227e+09	-7.015198e+09	-8.360000e+09	-1.262800e+07	-8.360000e+09	-1.018709e+08	-1.018709e+08	0.000000e+00	0.000000e+00	0.000000	-53.176500	-3053.200000	-3053.200000	-3216.080000	-2831.181800	-6.629500e+09	-8.465000e+09	-8.311600e+09	-3053.200000	-3053.200000	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	-1.932850e+10	-1.017445e+09	-2.030685e+10	-6.440654e+09	-7.458100e+09	0.000000e+00	-1.770249e+09	0.000000e+00	-8.970561e+09	-2.736842e+09	-2.787500e+10	-8.557072e+10	-1.262900e+10	-7.697817e+06	-6.306400e+10	-1.345100e+10	-4.880630e+10	-8.327900e+07	-1.370000e+08	-3.175545e+10	-1.455850e+11	-1.932000e+10	-1.678170e+11	-1.656360e+11	-1.963530e+11	-4.427000e+10	-1.156800e+10	-1.385010e+11	-4.300000e+10	-5.672351e+10	-1.042394e+11	-4943.552900	0.000000e+00	0.000000e+00	0.000000	0.000000	0.000000	-4967.751300	0.000000	0.000000	0.000000	-0.000000	0.000000	0.000000	-3053.200000	-150.163636	-84.906977	-223.482759	1.0	-3399.545455	-3053.200000	-98.750000	-278.863200	-7.101360e+01	-616.644600	-84.906977	-150.163636	-3053.200000	-41.095800	0.000000	0.000000	-0.008685	-1.439300	0.000000	0.000000	-8547.090800	-21650.054300	134.5455	-3741.250000	93.0392	-0.382600	-263.458300	-1.228723	-1.765524	-75776.000000	-1801.548711	6.086275e-01	-4.875219e+06	-3.619346e+06	0.000000e+00	-86.000000	-2825.636364	-595.823529	-1801.548711	-1556.100000	-16221.787529	-16221.787529	-0.000000	-3.167700e+00	-4.687614e+06	-4.875219e+06	-3.619346e+06	0.000000e+00	-2.397800e+05	0.000000e+00	-2.397800e+05	-1.139670e+01	0.000000e+00	-1.251747e+11	0.000000	0.000000	-4967.751300	0.000000	0.000000e+00	0.000000e+00	-271.564000	0.000000	-3356.751300	-18641.285600	-21375.000000	-25896.708100	-263.458300	-0.382600	-9616.000000	-1.439300	-75776.000000	-724.812500	0.000000	-86.000000	-1.293500	-0.731700	0.000000	0.000000	-52.684100	-16522.069800	-1.446200	0.000000	-616.644600	-278.863200	-5474.440600	-4.780213e+12	-2.101300e+09	-1.894410e+11	-1.240000e+09	0.000000e+00	-1.890953e+10	0.000000e+00	-21650.054300	-3741.250000	-8547.090800	-1.085600	-41.095800	0.000000	-7.101360e+01	-1.402752e+07	-48.793100	-1119.284200	-487.340900	-1110.006500	-85.007900	-85.007900	-0.998900	-0.849400	-1.000000	-4028.397700	-83431.428600	-1.000000	-1.000000	-1.000000	-0.761800	-0.738700	-0.965700	-0.774000	-0.726700	-0.968000	-0.530300	-0.722100	-0.846400	-1.000000	-1.000000	-1.000000	-1.000000	-1.000000	-0.979600	-230.000000	-1.051000	-1.043700	-1.000000	-1.003972e+02	0.000000
25%	5.788880e+07	-0.002350	3.135714e+06	3.092900e+07	0.000000e+00	1.548900e+07	3.106600e+07	-1.308000e+06	0.000000e+00	-3.732902e+06	0.000000e+00	0.000000e+00	0.000000e+00	-3.693000e+06	0.000000e+00	-4.369000e+06	-2.300000e-01	-2.400000e-01	1.554453e+07	1.379943e+07	0.000000	0.248700	0.036024	0.000000	-0.010000	-0.046700	3.646812e+06	-5.915000e+05	-3.728599e+06	0.000000	-0.001700	1.812575e+07	0.000000e+00	1.961225e+07	2.096328e+06	0.000000e+00	6.216000e+07	8.009750e+06	0.000000e+00	0.000000e+00	0.000000e+00	3.524285e+07	1.907368e+08	2.547500e+06	0.000000e+00	1.963000e+07	0.000000e+00	4.339782e+06	0.000000e+00	0.000000e+00	0.000000e+00	8.666000e+06	5.913900e+07	-1.128200e+07	-8.273925e+07	7.353440e+07	0.000000e+00	-4.946685e+07	1.549000e+06	5.764758e+06	1.836000e+06	2.815000e+05	1.281011e+06	-1.370000e+08	-1.200000e+07	-1.307400e+07	-3.257935e+08	-3.354167e+06	-5.642000e+06	-4.788225e+07	-5.000000e+07	-6.330000e+05	-1.721703e+07	-1.141200e+07	-0.374200	1.122400e+00	8.095000e-01	0.807750	0.000000	0.000000	2.096275	4.859322	9.356526	0.559122	-0.000000	6.846566	0.851143	0.018165	0.728538	0.632823	0.280132	1.0	0.012158	-0.006901	0.149406	-0.099300	-2.085000e-02	0.000000	0.632823	0.728538	0.018165	0.853150	0.000000	1.136849	0.123647	1.221000	0.819859	0.212494	-68.622050	10.176250	134.5455	10.526700	93.0392	0.022550	0.004700	0.000000	0.000000	0.000000	0.073443	1.535180e+00	7.842500e-02	-5.538500e-01	7.598500e-01	0.000000	0.026483	0.192590	0.073443	0.466218	0.300805	0.249213	-0.000000	3.805900e+00	-2.330000e-01	7.690000e-02	-5.600000e-01	7.598500e-01	4.001000e+00	7.861000e+00	4.083250e+00	2.708000e-01	1.993537e+08	2.255034e+08	0.000000	0.807675	2.060900	0.000000	1.120175e+00	8.035000e-01	0.963650	4.224900	4.093225	-7.007325	-0.021500	-0.042700	0.004700	0.022550	-0.547400	1.223500	0.000000	0.484625	0.000000	0.000000	0.092750	0.000000	0.000000	0.000000	0.012600	-1.672250	0.000800	0.000000	0.000000	-0.099300	-0.799800	1.387900e+07	1.636885e+08	-7.747000e+08	8.476944e+07	2.382503e+06	2.591102e+06	0.000000e+00	10.162075	10.524750	-68.627300	2.965125	0.853150	0.000000	-2.075000e-02	-1.807475e+00	-0.010200	-0.269800	-0.254500	-0.377500	-0.290775	-0.287900	-0.004425	-0.001300	0.000000	-0.328700	-0.640250	-0.006300	-0.010950	-0.023350	0.000000	0.000000	-0.014800	0.000000	0.000000	0.000000	0.000000	-0.006700	-0.008425	0.000000	0.000000	0.000000	-0.045800	0.000000	-0.015400	-0.084550	-0.048200	0.000000	-0.003900	-2.857143e+01	0.000000
50%	4.349010e+08	0.061850	1.414420e+08	1.908760e+08	0.000000e+00	7.381800e+07	1.386610e+08	4.104400e+07	2.563000e+06	2.842700e+07	5.335000e+06	0.000000e+00	0.000000e+00	2.223100e+07	0.000000e+00	2.101400e+07	7.700000e-01	7.500000e-01	4.268193e+07	3.976000e+07	0.000000	0.449900	0.137000	0.093100	0.051000	0.034200	7.404300e+07	4.256200e+07	2.361500e+07	0.068800	0.051200	7.255750e+07	0.000000e+00	8.438850e+07	4.240000e+07	1.758930e+06	2.753700e+08	9.104600e+07	2.829200e+07	0.000000e+00	4.600000e+05	3.484420e+08	1.089263e+09	2.510567e+07	2.798000e+06	1.218020e+08	7.396000e+07	1.729705e+08	0.000000e+00	7.740000e+05	0.000000e+00	1.516530e+08	5.886190e+08	-7.780000e+03	2.981150e+07	3.454000e+08	1.573300e+07	4.745816e+06	1.427000e+07	4.296150e+07	1.800000e+07	2.782500e+06	5.287800e+07	-1.705250e+07	0.000000e+00	0.000000e+00	-5.437000e+07	0.000000e+00	0.000000e+00	0.000000e+00	1.309673e+06	0.000000e+00	4.000000e+05	1.226500e+07	-0.048400	1.964600e+00	1.842500e+00	1.879500	14.684200	8.993800	9.710550	9.824162	16.209362	1.544053	0.002560	10.553933	1.593240	0.100009	0.921088	0.702852	0.475392	1.0	0.095568	0.056318	0.297302	0.055800	7.795000e-02	0.042500	0.702852	0.921088	0.100009	2.706000	3.376100	4.109010	0.501582	2.005000	1.427704	0.562938	-4.873550	45.165800	134.5455	26.460500	93.0392	0.171100	0.389300	0.051922	0.125816	1.320500	0.259684	2.227554e+00	1.625200e+00	5.510000e-01	2.259450e+00	0.000000	0.113410	0.662218	0.259684	2.618434	1.862784	1.299883	0.224380	1.149060e+01	8.300000e-01	1.622650e+00	5.500000e-01	2.259450e+00	1.041900e+01	2.150700e+01	1.100090e+01	5.733600e+00	9.149070e+08	1.148674e+09	14.682450	1.879550	9.706400	8.993750	1.964250e+00	1.836600e+00	2.277600	10.082300	12.616150	13.193850	0.036200	0.029800	0.389300	0.171100	0.956500	2.008500	1.302800	1.257550	0.000000	0.000000	0.222850	0.000000	0.036500	0.194600	0.033550	-0.917450	0.006000	13.585800	0.042500	0.055800	-0.125600	9.741000e+07	8.500508e+08	-6.846379e+06	7.837860e+08	4.316150e+07	2.508725e+07	1.808224e+06	45.133100	26.462950	-4.877000	6.171250	2.706000	3.376100	7.800000e-02	-4.848500e-01	0.072200	0.065500	0.061000	0.059100	0.096300	0.097800	0.008300	0.012600	0.000000	0.007350	-0.088200	0.042900	0.052700	0.043800	0.016000	0.002000	0.000000	0.000000	0.000000	0.000000	0.046200	0.043200	0.042950	0.000000	0.000000	0.000000	0.014250	0.000000	0.062200	0.037100	0.000000	0.000000	0.068800	-5.281842e+00	0.000000
75%	2.393625e+09	0.188875	1.199844e+09	8.922534e+08	9.911000e+06	3.510000e+08	5.885100e+08	2.712980e+08	4.300000e+07	2.193835e+08	5.752550e+07	0.000000e+00	0.000000e+00	1.658440e+08	0.000000e+00	1.630900e+08	2.030000e+00	2.000000e+00	1.171910e+08	1.130076e+08	0.654500	0.782500	0.304000	0.219300	0.138000	0.135900	4.049525e+08	2.817180e+08	1.755630e+08	0.177800	0.140350	2.858825e+08	1.691150e+07	3.300598e+08	2.958000e+08	1.075810e+08	1.198922e+09	8.360218e+08	4.100000e+08	2.258025e+07	2.551300e+07	2.392991e+09	4.924418e+09	1.814340e+08	4.149516e+07	6.301590e+08	1.007628e+09	1.358073e+09	4.876605e+06	6.119400e+07	0.000000e+00	1.455163e+09	3.161640e+09	0.000000e+00	5.604522e+08	1.535963e+09	4.053261e+08	5.141640e+08	8.318125e+07	2.550547e+08	1.127670e+08	1.149100e+07	3.145500e+08	-1.378638e+06	0.000000e+00	0.000000e+00	-3.586548e+06	7.470400e+07	8.425000e+06	0.000000e+00	7.006200e+07	0.000000e+00	3.000000e+07	1.429070e+08	0.113000	3.785000e+00	4.527100e+00	4.198300	23.979100	21.474300	16.406175	16.230272	26.079822	3.319632	0.027486	15.507845	3.255962	0.226624	1.000000	0.850666	0.800911	1.0	0.218920	0.145664	0.366562	0.169700	1.531500e-01	0.101400	0.850666	1.000000	0.226624	5.164250	10.844675	10.389072	1.000714	3.475000	2.758397	1.456602	0.000000	72.019750	134.5455	54.000100	93.0392	0.369450	1.019850	0.387503	0.432501	7.933625	0.606874	3.927946e+00	3.957725e+00	2.163000e+00	5.986325e+00	0.357000	0.264997	0.874133	0.606874	11.023445	5.212105	2.761710	0.535769	3.238150e+01	2.267300e+00	3.954550e+00	2.161600e+00	5.986325e+00	2.011940e+01	5.666600e+01	2.136085e+01	1.763870e+01	3.926217e+09	4.832950e+09	23.996825	4.195425	16.410100	21.476150	3.784175e+00	4.525150e+00	5.710050	15.400700	20.253600	25.989700	0.060675	0.067700	1.019850	0.369450	2.779175	3.476250	7.896275	2.108600	0.021200	0.357000	0.461800	0.036200	0.223750	0.599075	0.092300	-0.449300	0.019600	28.514800	0.101400	0.169700	0.103800	4.185280e+08	3.896543e+09	5.106348e+07	4.159895e+09	2.908209e+08	1.750891e+08	9.941438e+07	72.028075	54.006150	0.000000	10.110675	5.164250	10.844675	1.532000e-01	-9.665000e-02	0.225700	0.406100	0.353700	0.452900	0.519900	0.528400	0.086800	0.122150	0.055050	0.341450	0.435700	0.094100	0.117700	0.115800	0.093050	0.115100	0.129775	0.088600	0.120950	0.125450	0.099100	0.109750	0.111075	0.037950	0.031300	0.044000	0.205475	0.063850	0.217100	0.141900	0.210300	0.004700	0.220500	1.331484e+01	1.000000
max	1.824698e+12	42138.663900	1.537249e+12	4.621600e+11	1.153700e+10	1.856830e+11	3.056050e+11	1.565540e+11	3.152300e+10	8.720500e+10	3.970800e+10	4.917409e+09	8.368000e+09	2.339970e+11	2.741588e+09	2.339970e+11	8.028004e+06	6.624003e+06	1.112919e+11	1.112919e+11	10100.664000	2.659900	83.985400	97.590200	83.817600	45.292000	2.337210e+11	2.339970e+11	2.339970e+11	97.587700	97.396100	5.123080e+11	4.604800e+10	8.570900e+10	1.571052e+11	4.007370e+11	1.221035e+12	9.130160e+11	3.861800e+11	1.301620e+11	6.650100e+10	7.283810e+11	2.035703e+13	4.108950e+11	2.758350e+11	8.912570e+11	4.412529e+12	4.412529e+12	2.515800e+10	4.416800e+10	1.447804e+12	7.283810e+11	1.856536e+13	2.966300e+10	4.083840e+11	1.465725e+12	2.043074e+12	4.412529e+12	7.147500e+11	2.268290e+11	1.149940e+11	4.279000e+09	9.600000e+11	5.389408e+08	1.065800e+10	1.498722e+11	1.445908e+11	3.436000e+10	1.444008e+11	0.000000e+00	5.159800e+11	3.230100e+10	2.087463e+11	9.887000e+10	296.856700	8.786033e+07	8.786033e+07	26397.558500	4972.650400	11043.963600	48212.903200	3933.316576	7363.425186	25730.200000	410.898697	47022.618000	1846.587123	97.590197	74.922705	99.750000	2.659945	1.0	40.720398	97.396123	85.906977	345.631000	7.018763e+06	606.105500	99.750000	74.922705	97.590197	640.898700	14112.567800	446511.857143	19.595573	5852.228000	5850.927773	225.801591	89.207300	28768.636400	134.5455	192421.363600	93.0392	26.238900	2131.512800	53.618462	53.618462	38988.418600	863187.771511	2.851266e+06	1.155862e+07	1.030207e+07	4.823294e+07	180.900000	57805.094665	8.308966	863187.771511	863187.771511	64384.913180	64384.913180	113.453608	2.193725e+08	5.153974e+06	1.155862e+07	1.030207e+07	4.823294e+07	6.163623e+08	1.095290e+09	1.407759e+08	2.456253e+08	9.619200e+13	6.096634e+11	4972.650400	26397.558500	48212.903200	11043.963600	8.786033e+07	8.786033e+07	23710.175200	35296.196000	33522.612500	44509.072600	85.400000	293.817600	2131.512800	26.238900	914.406300	5852.228000	38988.418600	2086.250000	6.242800	180.900000	1540.600000	2156.000000	0.927400	596.823500	18.451400	35298.333300	1267.600000	540174.926600	606.105500	345.631000	283.038900	1.083289e+13	2.515999e+12	5.332900e+10	3.052798e+12	1.576925e+11	2.091685e+11	4.483400e+10	28768.636400	192421.363600	89.207300	108419.771000	640.898700	14112.567800	7.018763e+06	1.255873e+06	336767.800000	13562.153200	13545.338700	475.714300	136.000000	134.000000	3935.709000	3788.339100	81.278200	829.415100	2481.083300	1.747000	3.508500	11.836500	1.732700	4.186200	4.443100	0.984100	2.592300	17.028200	2.407000	6.031600	12.439600	0.574600	2.981100	3.605800	9328.200000	76.625000	5468.426400	1360.125000	729.576600	1542.611000	225.690000	2.418601e+06	1.000000

	train-rmse-mean	train-rmse-std	test-rmse-mean	test-rmse-std
0	0.493484	0.000296	0.495126	0.000110
1	0.488064	0.000293	0.491800	0.000202
2	0.483503	0.000773	0.488921	0.000016
3	0.479276	0.000832	0.486460	0.000160
4	0.475218	0.001051	0.484481	0.000092