Home Run Modeling Part 1: Base Model

Why home runs? Some of the best moments in baseball games are home runs. Something about hitting the ball out of the park is satisfying. Since baseball season just started, I wanted to model a part of the game. I decided to model home runs since they are fairly rare events but should still be predictable. By predictable, I mean that we can accurately estimate the probability of a player hitting a home run....

February 28, 2024 · 10 min · Lucas Pauker

OpenAI Model Timing

Introduction The goal of this article is to explore the latency of different OpenAI models. When using AI models in production, latency is an important factor to consider. Comparing Model Architectures First, I test the latency for different OpenAI models. I test the following models: gpt-4, gpt-4-0613, gpt-3.5-turbo, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k, gpt-3.5-turbo-16k-0613, text-davinci-003, text-davinci-002, text-davinci-001, text-curie-001, text-babbage-001, text-ada-001, davinci-002, babbage-002, davinci, curie, babbage, and ada. These are all the OpenAI models that are available for inference through the chat and completions endpoints....

October 15, 2023 · 4 min · Lucas Pauker

LLMs Unleashed: The Power of Fine-Tuning

Disclaimer: This article mentions https://terra-cotta.ai/, an LLM experimentation platform I am building. Introduction ChatGPT, Bard, and other large language models (LLMs) are very useful for a wide variety of tasks, from writing code to answering complex questions to aiding with education. However, these models are ultimately limited by the data that they are trained on. They are also trained to answer a wide variety of questions, which may not be sufficient for domain-specific questions....

July 23, 2023 · 5 min · Lucas Pauker

50 AI Applications

Advancements in artificial intelligence and language models have made significant impacts in various fields from healthcare to finance to entertainment. Here are 50 practical applications of AI that are currently in use or have the potential to be implemented in various industries. Let me know if any of these ideas inspire you or if you build any of them! Text Analysis Automatically generate outlines or summaries of news articles. Find fake news and provide a citation with the real source....

April 4, 2023 · 3 min · Lucas Pauker

Blackjack Reinforcement Learning

Introduction I recently read Ed Thorp’s Beat the Dealer, a book about how Thorp, a mathematician, found a way to gain an edge in blackjack. In the book, Thorp uses computer simulations to calculate the best blackjack strategy as well as card-counting strategies. Since I took a reinforcement learning class last quarter, I wanted to apply one of the most common algorithms, Q-learning, to find the best strategy for blackjack....

April 5, 2021 · 11 min · Lucas Pauker

Achieving Quantum Supremacy, Qubit by Qubit

Faster than a Supercomputer? In the 1980s, American physicist Richard Feynman proposed the idea of quantum computers to model complex quantum systems. In October 2019, around 40 years later, Google AI and NASA scientists unveiled a quantum computer which ran an experiment in a few minutes that would take the fastest supercomputer 10,000 years. The quantum computer sped up the computation by a factor of 1 billion! This was one of the first major successes in the nascent field of quantum computing....

March 11, 2021 · 8 min · Lucas Pauker

Simple Stock Market Models with Python

Introduction In this blog post, I will implement a few simple time series models of a stock price over time. I will also evaluate how they perform when we use them to trade. We will look at moving averages (MA) and exponential moving averages (EMA). Data First, we need to download the price data. For this article, we will use SPY historical open price data. We can download this from Yahoo Finance....

December 20, 2020 · 4 min · Lucas Pauker

Solar Flare Time Series Research

Introduction I spent the summer of 2019 as a physics research intern at the Stanford University Solar Lab. I was very fortunate to have a wonderful advisor and had a great summer overall. I created machine learning models to characterize time series data for solar flare prediction. In this article, I will first provide some physics background about solar flares, then dive into my research. For a more in-depth analysis, check out the source code and my poster....

September 15, 2019 · 10 min · Lucas Pauker

Classical Music Classifier Project

Introduction This project, done for my CS221 class, aims to classify classical music by musical era (Baroque, Classical, Romantic, Modern) with composers as a proxy. Using audio processing techniques, such as the Short-time Fourier Transform, we extracted features such as the spectrogram and chromagram of the audio data from two datasets, Free Music Archive and MAESTRO. We used two ensemble classifiers, AdaBoost and Random Forest, and found that although AdaBoost performed marginally better than Random Forest, the latter made more generalizable predictions....

June 28, 2019 · 7 min · Lucas Pauker