# Other articles

# So You're Running a Data Project

A project workflow:

Establish the Need - what is the actual value generated by the work? This stuff only works if you have some measurable outcome for the business. A measurable outcome also helps with the often difficult sell to upper management!

Produce an "ideal" solution - Dream a little - how would it work if ...

# Machine Learning for the Common Man

# They say...

## Supervised Learning

- Classification
- Regression

# You say...

## Educated Guesser given some known data

- Guesses word or integer labels
- Guesses real values

# They say...

## Unsupervised Learning

- Manifold Learning
- Clustering
- Latent Factor Analysis

# You say...

## Explore relationships within the data

- Visually display relationships
- Group
- Find underlying drivers

# Workshop Slides

Here are the slides from my workshop at Digital City Innovation.

# Feature Hashing in Coffeescript

Feature Hashing is a useful technique for dealing with sparse or "bag-of-words" type data-sets. The essential idea is to hash the features into a new feature space of a prespecified vector size. The algorithm employs hashing on two levels - the first is the hashing to the hash-space index, the second ...
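As a rough illustration of the idea, here is a minimal sketch in Python (not the CoffeeScript of the full article). The excerpt above is cut off before it describes the second hash, so the sketch assumes the common formulation of the hashing trick in which the second hash picks a sign (+1 or -1) for each feature. The function names `stable_hash` and `hash_features`, the salts, and the use of MD5 are all choices made for this example only.

```python
import hashlib


def stable_hash(token: str, salt: str) -> int:
    """Deterministic integer hash of a token.

    MD5 is used here only because it is stable between runs
    (Python's built-in hash() is randomised for strings).
    """
    digest = hashlib.md5((salt + ":" + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hash_features(tokens, n_buckets=16):
    """Map a bag of tokens onto a vector of prespecified size n_buckets."""
    vector = [0.0] * n_buckets
    for token in tokens:
        # First hash: choose the index in the hash-space.
        index = stable_hash(token, "index") % n_buckets
        # Second hash: ASSUMPTION, a sign hash as in the usual hashing trick.
        sign = 1.0 if stable_hash(token, "sign") % 2 == 0 else -1.0
        vector[index] += sign
    return vector


doc = "the cat sat on the mat".split()
vec = hash_features(doc, n_buckets=8)
print(vec)
```

Whatever the size of the vocabulary, the output always has `n_buckets` entries, which is what makes the technique handy for sparse data. Different words can collide in the same slot; under the sign-hash assumption those collisions tend to cancel rather than pile up.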

# Practical Matrix Factorization

# Introduction

The problem in hand is to estimate unknown probabilities (missing data) in an activity matrix that indicates the interaction between small molecules and proteins. The activity matrix is initially binary, with 1 indicating positive interaction and 0 indicating an unknown interaction. The task is to replace the unknown interactions ...

# Mapping and Clustering the Premiership in Python

## Introduction

The aim of this tutorial is to gain practical experience of performing mapping and clustering, using the same premiership data-set used in the classification tutorial. This tutorial is deliberately shorter than the previous one, as you should already be up and running with Python.

## Mapping

Mapping is a way of letting ...

# Mapping and Clustering the Premiership in R

## Introduction

The aim of this tutorial is to gain practical experience of performing mapping and clustering, using the same premiership data-set used in the classification tutorial. This tutorial is deliberately shorter than the previous one, as you should already be up and running with **R**.

## Mapping

Mapping is a way of letting ...

# Predicting the Premiership with K-nearest neighbour in Python

## Predicting the premiership with the *K*-Nearest Neighbour Algorithm

Before we start, there are two mathematical words that need explaining: *Vector* and *Dimension*. Vector means something quite specific mathematically, but in our case just think of it as a list of numbers. For example, let's take a ...
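To make *vector* and *dimension* concrete, here is a minimal Python sketch of a *K*-nearest neighbour guess. It is not the code from the full tutorial: the function names, the two made-up features per example and the labels are all invented for illustration, and Euclidean distance is assumed as the measure of "nearness".

```python
import math
from collections import Counter


def dimension(vector):
    """A vector is just a list of numbers; its dimension is the list's length."""
    return len(vector)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(query, examples, k=3):
    """Guess a label by majority vote among the k closest known examples."""
    nearest = sorted(examples, key=lambda ex: euclidean_distance(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data: each example is (vector, label); the numbers are made up.
examples = [
    ([3.0, 0.0], "win"),
    ([2.0, 1.0], "win"),
    ([1.0, 1.0], "draw"),
    ([0.0, 2.0], "loss"),
    ([0.0, 3.0], "loss"),
]

print(dimension(examples[0][0]))
print(knn_predict([2.5, 0.5], examples, k=3))
```

Each example here is a 2-dimensional vector, and the guess for a new vector is simply the most common label among its three closest neighbours.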
