Basic

A collection of code snippets and useful code blocks that I have applied to various projects.

Classification algorithm comparisons

Algorithm comparison on finance loans
# Assumes clf (KNN), loanTree, loanS and LR are models fitted earlier,
# and test_X / test_y are the held-out features and labels.
import pandas as pd
from sklearn.metrics import f1_score, jaccard_score, log_loss

models = {"KNN": clf, "Decision Tree": loanTree, "SVM": loanS, "LogisticRegression": LR}
rows = []
for algo, model in models.items():
    yhat = model.predict(test_X)  # class predictions used by the evaluation metrics
    # Log loss needs predicted probabilities, so it is reported for logistic regression only
    log_val = log_loss(test_y, LR.predict_proba(test_X)) if algo == 'LogisticRegression' else 'NA'
    rows.append({'Algorithm': algo,
                 # jaccard_similarity_score no longer exists in scikit-learn (removed in 0.23);
                 # jaccard_score with average='weighted' stands in and may give different values
                 'Jaccard': jaccard_score(test_y, yhat, average='weighted'),
                 'F1-score': f1_score(test_y, yhat, average='weighted'),
                 'LogLoss': log_val})
# Build the table in one step (DataFrame.append is not available in pandas 2.0+)
algo_result = pd.DataFrame(rows, columns=['Algorithm', 'F1-score', 'Jaccard', 'LogLoss'])
algo_result.style.hide(axis='index')  # Styler.hide_index() in pandas older than 1.4

The output of the above code is a table comparing the algorithms by F1-score, Jaccard index and log loss. On this dataset, the support vector machine (SVM) yielded the best result.
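The snippet above depends on models and test data defined elsewhere, so here is a minimal, self-contained sketch of the same kind of comparison. Everything specific in it is an assumption for illustration: the data is synthetic (generated with `make_classification`, not the finance loan data), the hyperparameters are arbitrary, and `jaccard_score` with `average="weighted"` is used as the Jaccard metric. The scores it prints say nothing about the loan results.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, jaccard_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the loan dataset (assumption)
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative models; the hyperparameters are arbitrary choices, not tuned values
models = {
    "KNN": KNeighborsClassifier(n_neighbors=7),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

rows = []
for name, model in models.items():
    model.fit(train_X, train_y)
    yhat = model.predict(test_X)
    # Log loss is reported for logistic regression only, mirroring the table above
    if name == "LogisticRegression":
        log_val = log_loss(test_y, model.predict_proba(test_X))
    else:
        log_val = float("nan")
    rows.append({
        "Algorithm": name,
        "F1-score": f1_score(test_y, yhat, average="weighted"),
        "Jaccard": jaccard_score(test_y, yhat, average="weighted"),
        "LogLoss": log_val,
    })

algo_result = pd.DataFrame(rows)
# Rank by F1-score to see which model did best on this particular split
best = algo_result.sort_values("F1-score", ascending=False).iloc[0]["Algorithm"]
print(algo_result)
print("Best by F1-score:", best)
```

Collecting the rows in a list and building the DataFrame once avoids growing a DataFrame inside the loop, which is slow and no longer supported through `append` in recent pandas versions.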

Resample and Summarize Time Series Data With Pandas – Daily to Weekly Summary

Example of resampling daily trading activity into weekly totals, with each week ending on a Wednesday, so that it can later be compared against a separate dataset that is only collected on Wednesdays.
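# Convert daily data to weekly totals for each client.
# 'W-WED' makes each weekly bin end on a Wednesday; closed='right' includes that
# Wednesday in the bin and label='right' stamps the bin with the Wednesday date.
weekly_trades = (FT_volumes.groupby("Client")
                 .resample('W-WED', label='right', closed='right', on='Date')
                 .sum()
                 .reset_index()
                 .sort_values(by='Date')
                 .reset_index(drop=True))  # renumber rows after sorting
weekly_trades.head()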
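To make the binning concrete, here is a small, self-contained sketch on made-up data. The `daily` frame, its `Volume` column and all of the values are hypothetical. It is also a slight variation on the snippet above: it sets `Date` as the index and resamples a single column, rather than passing `on='Date'`, which keeps the client column out of the sum.

```python
import pandas as pd

# Hypothetical daily trades for two clients (all values are made up)
daily = pd.DataFrame({
    "Client": ["A", "A", "A", "A", "A", "B", "B"],
    "Date": pd.to_datetime([
        "2023-01-02",  # Monday
        "2023-01-04",  # Wednesday
        "2023-01-05",  # Thursday
        "2023-01-11",  # Wednesday
        "2023-01-12",  # Thursday
        "2023-01-03",  # Tuesday
        "2023-01-10",  # Tuesday
    ]),
    "Volume": [10, 20, 30, 40, 50, 5, 7],
})

# Weekly bins that end on (and include) each Wednesday, labelled by that Wednesday
weekly = (
    daily.set_index("Date")
    .groupby("Client")["Volume"]
    .resample("W-WED", label="right", closed="right")
    .sum()
    .reset_index()
    .sort_values(by=["Date", "Client"])
    .reset_index(drop=True)
)
print(weekly)
```

For client A, the Monday 2 January and Wednesday 4 January trades land in the bin labelled 4 January, while the Thursday 5 January trade rolls into the following week's bin labelled 11 January. Every label is a Wednesday, so the result can be joined directly on `Date` to data collected on Wednesdays.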