# So You're Running a Data Project

A project workflow:

Establish the Need - what is the actual value generated by the work? This stuff only works if you have some measurable outcome for the business. This helps with the often difficult sell to upper management!

Produce an "ideal" solution - Dream a little - how would it work if ...

# They say...

## Supervised Learning

- Classification
- Regression

# You say...

## Educated Guesser given some known data

- Guesses words and integer labels
- Guesses real values

# They say...

## Unsupervised Learning

- Manifold Learning
- Clustering
- Latent Factor Analysis

# You say...

## Explore relationships within the data

- Visually display relationships
- Group
- Find underlying drivers

# Workshop Slides

Here are the slides from my workshop at Digital City Innovation

Feature Hashing is a useful technique for dealing with sparse or "bag-of-words" type data-sets. The essential idea is to hash the features into a new feature space of a prespecified vector size. The algorithm employs hashing on two levels - the first is the hashing to the hash-space index, the second ...
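As a rough illustration of the idea, here is a minimal sketch in Python using only the standard library. The first hash picks the index in the fixed-size vector. The excerpt above is cut off before it describes the second hash, so the sign hash used here is an assumption taken from the commonly described form of the hashing trick, not necessarily what the full post does. The names `_hash` and `hash_features`, the salts, and the default vector size are all hypothetical.

```python
import hashlib


def _hash(token: str, salt: str) -> int:
    """Deterministic integer hash of a token; the salt gives independent hashes."""
    digest = hashlib.md5((salt + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hash_features(tokens, n_features=16):
    """Map a bag of tokens onto a vector of prespecified size n_features."""
    vec = [0.0] * n_features
    for tok in tokens:
        # First level: hash the token to an index in the hash space.
        index = _hash(tok, "index") % n_features
        # Second level (assumed): hash the token to a sign of +1 or -1,
        # so that colliding tokens tend to cancel rather than pile up.
        sign = 1.0 if _hash(tok, "sign") % 2 == 0 else -1.0
        vec[index] += sign
    return vec


vec = hash_features(["the", "cat", "sat", "on", "the", "mat"], n_features=8)
```

The output length depends only on `n_features`, never on the vocabulary size, which is what makes the technique attractive for sparse bag-of-words data.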

# Introduction

The problem at hand is to estimate unknown probabilities (missing data) in an activity matrix that indicates the interaction between small molecules and proteins. The activity matrix is initially binary, with 1 indicating positive interaction and 0 indicating an unknown interaction. The task is to replace the unknown interactions ...
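To make the data structure concrete, here is a small sketch of a binary activity matrix and the entries that would need estimating. The values are toy data invented for illustration, and the layout (rows as molecules, columns as proteins) and the helper name `unknown_entries` are assumptions rather than anything stated in the post.

```python
# Toy data, invented for illustration: rows are molecules, columns are proteins.
# 1 = known positive interaction, 0 = unknown interaction.
activity = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
]


def unknown_entries(matrix):
    """Return (row, column) pairs whose value is 0, i.e. unknown interactions."""
    return [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value == 0
    ]


unknown = unknown_entries(activity)
```

Note that a 0 here does not mean "no interaction", only "not observed", which is why these are the entries to be estimated.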

## Introduction

The aim of this tutorial is to gain practical experience of performing mapping and clustering, using the same premiership data-set used in the classification tutorial. This tutorial is deliberately shorter than the previous one, as you should already be up and running with Python.

## Mapping

Mapping is a way of letting ...

## Predicting the premiership with the *K*-Nearest Neighbour Algorithm

Before we start, there are two mathematical words that need explaining; these are ...

