Thursday 6 April 2017

Data Science Interview Questions - Collections from other websites

https://www.dezyre.com/article/100-data-science-interview-questions-and-answers-general-for-2017/184

R


SciPy


NumPy


Python Pandas


Kafka


Cassandra


Hadoop


Spark


Learning Algorithms


Supervised


Unsupervised

Poisson Regression


Logistic Regression

What is Logistic Regression?

What types can the predictors (independent variables) and the dependent variable be?
Predictors can be continuous, categorical or a mix of both.
The dependent variable is categorical (nominal): binary (two classes) or multinomial (more than two classes).

What is binomial logistic regression? Multinomial logistic regression? Give some real-world examples.

When to use Logistic Regression?

Write the hypothesis.

What is the logistic function?

What does the logistic function actually do?
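
As a reminder of what the answers to the three questions above look like, here is the usual form of the hypothesis and the logistic (sigmoid) function, in the notation of the Stanford exercise listed in the references of this section. The symbols are assumptions of that notation: \theta is the parameter vector and x is the feature vector.

```latex
h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}
```

The logistic function g squashes any real number z into the interval (0, 1), so h_\theta(x) can be read as the estimated probability that y = 1 for input x.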

Write the cost function.

Explain how the cost function adds cost (error) when the hypothesis predicts the wrong outcome, and how it handles the case when the hypothesis predicts the right value.
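
A minimal sketch of the per-example cost, assuming the standard log-loss form -y*log(h) - (1 - y)*log(1 - h), where h is the predicted probability that y = 1. The function name and the sample values of h are made up for illustration.

```python
import math

def example_cost(h, y):
    """Log-loss cost of one example: h is the predicted P(y = 1), y is the true label (0 or 1)."""
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

# Illustrative predictions for an example whose true label is y = 1.
for h in (0.99, 0.5, 0.01):
    print(f"y=1, h={h:.2f} -> cost={example_cost(h, 1):.3f}")
```

When the prediction agrees with the label (h close to 1 for y = 1) the cost is close to 0. As the prediction moves towards the wrong answer (h close to 0 for y = 1) the cost grows without bound, so confident mistakes are penalized the most.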

Write the regularized cost function.
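
For comparison, one common way to write the regularized cost function, following the notation of the Stanford exercise listed in the references of this section. Assumptions: m is the number of training examples, n is the number of features, h_\theta is the hypothesis, \lambda is the regularization strength, and the intercept \theta_0 is not regularized.

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2
```

Setting \lambda = 0 gives back the unregularized cost function.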

References:
https://www.r-bloggers.com/how-to-perform-a-logistic-regression-in-r/
https://github.com/BojanKomazec/machine-learning-stanford-course/blob/master/machine-learning-ex2/ex2.pdf



Linear Regression


Generalized linear model


Examples of generalized linear models:
linear regression
logistic regression
etc.

References:
https://en.wikipedia.org/wiki/Generalized_linear_model

Normal (Gaussian) Distribution

Describe the Normal (Gaussian) distribution.

Write the probability density function and explain its parameters.

Draw a couple of diagrams with examples of the normal distribution, showing how the parameters of its probability density function shape them.
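
A small sketch that may help when preparing the answers above. It evaluates the normal density f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2)) for a few parameter choices. The function name and the (mu, sigma) pairs are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma (> 0)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Illustrative parameter choices: the peak always sits at x = mu.
for mu, sigma in [(0.0, 1.0), (0.0, 2.0), (3.0, 1.0)]:
    peak = normal_pdf(mu, mu, sigma)
    print(f"mu={mu}, sigma={sigma}: peak at x={mu}, height={peak:.4f}")
```

Changing mu shifts the bell curve left or right without changing its shape. Increasing sigma makes the curve wider and its peak lower, since the area under the curve must stay equal to 1.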

References:
https://en.wikipedia.org/wiki/Normal_distribution

Wednesday 5 April 2017

Distributions


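List some distribution types.
Bernoulli
multinomial
Gaussian
Poisson
gamma
Tweedie
Laplace
quantile (strictly a loss function used for quantile regression, not a distribution)
Huber (likewise a loss function, not a distribution)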


References:
https://en.wikipedia.org/wiki/Probability_distribution
https://en.wikipedia.org/wiki/Multinomial_distribution
https://en.wikipedia.org/wiki/Bernoulli_distribution
https://en.wikipedia.org/wiki/Quantile_function

Collections in R

How to get the number of rows/columns of a data object x?
nrow(x)
ncol(x)


How to get the first/last N elements of a data object x?
head(x, n = N)
tail(x, n = N)
If n is omitted, both return 6 elements by default.


References:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/nrow.html
https://rdrr.io/r/utils/head.html

Gradient Descent

What are the types of Gradient Descent?
  • batch
  • mini-batch
  • stochastic
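
The three types listed above differ only in how many training examples are used for each parameter update. Below is a minimal sketch under stated assumptions: a one-parameter model y ~ w * x, mean squared error as the cost, and noiseless toy data. The function name, learning rate and data are all made up for illustration.

```python
import random

def gradient_descent(xs, ys, batch_size, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w * x by minimizing mean squared error.

    batch_size == len(xs)     -> batch gradient descent
    1 < batch_size < len(xs)  -> mini-batch gradient descent
    batch_size == 1           -> stochastic gradient descent
    """
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean of (w*x - y)^2 over the current batch.
            grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
    return w

xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [3.0 * x for x in xs]  # noiseless toy data, true w = 3

for name, size in [("batch", len(xs)), ("mini-batch", 2), ("stochastic", 1)]:
    print(name, round(gradient_descent(xs, ys, size), 4))
```

On this toy data all three variants should end up close to w = 3. On real, noisy data the batch variant takes smooth but expensive steps, the stochastic variant takes cheap but noisy steps, and mini-batch sits in between.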