edgarhassler.com

Hungarian Method

A Bridge Between Chaos and Order

Maximal weight bipartite matching in \(O(n^3)\) time. Matching points fast!
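
As a rough illustration of what an \(O(n^3)\) assignment solver looks like, here is a minimal sketch of the potentials-based form of the Hungarian method in plain Python. It is not the code from the post: the function name `max_weight_matching`, the square-matrix input, and the trick of negating weights to turn maximization into minimization are all assumptions made for this example.

```python
def max_weight_matching(weights):
    """Return (total, assignment) maximizing sum(weights[i][assignment[i]]).

    `weights` is assumed to be a square list of lists (n rows, n columns).
    """
    n = len(weights)
    # Negate so that the minimization form of the algorithm maximizes weight.
    cost = [[-w for w in row] for row in weights]
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials (1-based)
    v = [0] * (n + 1)      # column potentials (1-based)
    match = [0] * (n + 1)  # match[j] = row matched to column j, 0 = free
    way = [0] * (n + 1)    # back-pointers along the augmenting path

    for i in range(1, n + 1):
        match[0] = i       # column 0 is a virtual start for row i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta = INF
            j1 = 0
            # Find the unused column with the smallest reduced cost.
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so the chosen edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:   # reached a free column: path found
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1

    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[match[j] - 1] = j - 1
    total = sum(weights[i][assignment[i]] for i in range(n))
    return total, assignment
```

For example, `max_weight_matching([[4, 1, 3], [2, 0, 5], [3, 2, 2]])` returns `(11, [0, 2, 1])`: row 0 takes column 0, row 1 takes column 2, and row 2 takes column 1.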

Interpolating Temperatures

We frequently have temperature data at measurement sites provided by airports, but need to understand the temperatures in the areas between airports. Gaussian Process regression is one way we can do this. Here, I use what used to be publicly available data to model temperatures at each zip code in Arizona.

Gaussian Process Regression Notes

I did a class project about Gaussian Process regression and I really enjoyed the topic, so I thought, stupidly, “Hey, why not bring those ideas forward into a more friendly blog post?” The answer, I would find out weeks later, is that I would spiral out on it. Having severed the dangling appendages, here are the remains of that mistake. Notes on Gaussian Process regression!

Llama 3.2 on Apple Silicon

Part 2 of Trying to Justify a MacBook Pro

In my last post I talked about using Stable Diffusion 3 to automate headers for my website, really to show that I need a MacBook Pro. My goal was to automate making heading…

Stable Diffusion 3 on Apple Silicon

Trying to Justify a MacBook Pro for “Professional Development”

I’ve decided my blog posts are too boring and could use some spicing up. How about making heading images for my posts? But I don’t want to pay for a third party API. Also, I…

Python Web Scraping in 2024

Comparing Some Async and Parallel Methods for Performance

Web scraping requires a delicate balance to maintain high throughput. Here, I examine the Python libraries aiohttp and httpx and compare them to the venerable requests library run under multiprocessing. Surprisingly, aiohttp did very well, suggesting that parallelism in the operating system’s management of sockets compensates for the single-threaded nature of this async package.

The Formula and Factors in R

Many important models are defined in R using a special, powerful notation due to Wilkinson and Rogers. Another important aspect of modeling in R is how categorical factors are interpreted within the models themselves. Here, I discuss how to make use of these tools and how to avoid some common mistakes.

My Surprise at Closed Testing Procedures

Closed testing procedures are one way to guarantee that our family-wise type 1 error rates don’t rise above our nominal value. I was surprised when I read they can improve things like Dunnett’s procedure. Here’s a quick note on it.

Failing to Learn After Rules

Simulating a Machine Learning Approach to Data Generated from Strict Rules

Companies like to take machine learning models and apply them to historical data, assuming that things will shake out in the end. Here I present a simulated case study where this fails in a way that one might not expect.

Always Valid Sample Ratio Mismatch Monitoring

Sample ratio mismatch testing is a good place to apply fully sequential procedures. Here is one in the style of Robbins for A/B/n tests.

Haircutting A/B Tests?

Winner’s Curse and Other Ways to Shoot Oneself in the Foot

In A/B testing, while our estimator of effect size is often unbiased, adding up all the significant effects turns out to be quite biased. This is called the curse of the winner, and I’ll explore how this harms many A/B testing programs, and show how controlling error rates does nothing to ensure our effect estimates are appropriate.

Cursed Example: Waiting for Events

Many A/B tests are time-to-event tests that are run as binomial tests of proportions run at particular times. We can have very different p-values depending on when we choose to analyze such tests.

Do People Behave Binomially?

In other words, is the two-sample binomial test of proportions appropriate for A/B testing?

Frequently, one uses binomial random variables to model A/B tests, but is this appropriate? Here, I show why the concern is valid, and why we’re fine using a two-sample test of proportions for A/B tests.

Notes on Robbins’ SPRT

The mixture sequential probability ratio test is an extension of Wald’s SPRT that allows us to reduce the worst case performance by integrating the likelihood ratio over a family of alternative values. The resultant test guarantees our frequentist false alarm rate.

Notes on Wald’s SPRT

Wald’s sequential probability ratio test is a procedure that re-evaluates the evidence after each new observation. The procedure terminates as soon as the evidence is sufficient to decide between \(H_0\) and \(H_1\), and can result in runtime savings over traditional “fixed horizon” approaches common to A/B testing.
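
The stopping rule can be sketched for the simplest case, Bernoulli observations with two simple hypotheses. This is an illustrative sketch rather than the post's own code: the function name `wald_sprt`, the default error rates, and the use of Wald's approximate boundaries \(\log((1-\beta)/\alpha)\) and \(\log(\beta/(1-\alpha))\) are assumptions of this example.

```python
import math


def wald_sprt(observations, p0, p1, alpha=0.05, beta=0.20):
    """Wald's SPRT for 0/1 data: H0: p = p0 versus H1: p = p1.

    Returns (decision, n_used). Boundaries use Wald's approximations.
    """
    upper = math.log((1 - beta) / alpha)   # crossing above rejects H0
    lower = math.log(beta / (1 - alpha))   # crossing below accepts H0
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    # Data ran out before either boundary was crossed.
    return "continue sampling", n
```

With `p0=0.1` and `p1=0.3`, a run of successes such as `wald_sprt([1] * 10, 0.1, 0.3)` stops after three observations with `("reject H0", 3)`, well before the ten observations are exhausted.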

Quantifying the CUPED Improvement with Binomial Responses

CUPED is a method for reducing the variance of an estimator by leveraging another correlated variable. Many tout the importance and impact of such a method on speeding up A/B tests, but CUPED is not a panacea, and here I show some cases where it provides little value.
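
A minimal sketch of the standard CUPED adjustment makes it easier to see why the gain can be small. The adjusted response is \(Y - \theta (X - \bar{X})\) with \(\theta = \mathrm{Cov}(Y, X) / \mathrm{Var}(X)\), and its variance is the original variance times \(1 - \rho^2\), so a weakly correlated covariate buys very little. The function name `cuped_adjust` and the toy data in the usage note are assumptions of this example, not material from the post.

```python
def cuped_adjust(y, x):
    """Return (adjusted_y, theta) for response y and pre-experiment covariate x.

    Assumes len(y) == len(x) >= 2 and that x is not constant.
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and variance (both with the n - 1 denominator).
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    theta = cov_xy / var_x
    # Subtract the part of y explained by x; the mean of y is unchanged.
    adjusted = [yi - theta * (xi - mean_x) for xi, yi in zip(x, y)]
    return adjusted, theta
```

For instance, with binary conversions `y = [1, 0, 1, 1, 0, 0, 1, 0]` and prior visit counts `x = [3, 1, 4, 2, 1, 0, 5, 2]`, the adjusted values keep the same mean as `y` while their sample variance shrinks by exactly the factor \(1 - r^2\), where \(r\) is the sample correlation between `x` and `y`.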

Data Quality Control

A Requirement for Data for Data Science and Analytics Users

Data quality is important for modern analytics and data science teams. No one treats it that way. Here, I enumerate some ways to think about and monitor data quality.

Analysis Question Without Answers

Reflections on the difficulty of assigning a value to a product feature

Sometimes a question sounds simple but, like some fractal of suffering, as one gets closer and closer one finds that things are underdetermined, too open to subjective interpretation, and that no result will serve as an appropriate answer.

Multivariate Tests on the Web

Factorial and orthogonal array designs have a place in marketing experimentation, despite their unpopularity amongst some in the A/B testing world. Here, we get into some of the efficiency and philosophy behind these experimental designs.

Marketing Data Science BOM

A ChatGPT-Inspired Bill of Materials for Marketing Data Science Projects

What makes a marketing data science project successful? What is required to make this set of tools work for you? Much has been dedicated to techniques for individual models, feature engineering, et cetera, yet general requirements often fail to include all of the details. Here, I discuss some things critical to a general set of applications.

Alpha Spending for Group Sequential Trials

Group sequential designs offer analysts a lot of flexibility in that a test can be monitored with multiple analyses, and it can be stopped early when the test is likely to end up futile or already has strong evidence that the null hypothesis isn’t correct. These are my notes on these types of designs.

Notes on Hsu’s Method for Conversion Data

Hsu’s method is a way to compare treatments to the best of the remaining treatments all while preserving the overall type 1 error rate. The result is confidence intervals that tell you if a particular treatment is dominated by another or if the treatment is among the best. Here are my notes on the procedure.

Notes on Dunnett’s Method for Conversion Data

Dunnett’s method is a way to compare any number of treatments against a common control, all while preserving the overall type 1 error rate. This is a far more efficient approach than using Bonferroni corrections or doing a series of A/B tests. Here are my notes on the procedure.

Your A/B Test May Be Three Different Tests

People tend to think of A/B tests in terms of the simple test of a challenger over a control, but the data we collect often finds itself in three distinct uses that we often fail to plan for. Here, I discuss this and ways to address it.

A/B Testing: Oops I got A in my B!

Sometimes failed randomization can put people into both the control and challenger treatments, and just counting them for both or ignoring them results in different conclusions…

Pandemic Christmas Tree 2020

A colorful Christmas tree that’s baby-safe! Trying out Twinkly lights!

You Probably Don’t Want To Pick The Winner

People often want an experiment to pick the winner, but that requires showing the winner is significantly better than the next best option, and it can be very expensive to detect what could be a very small difference. Here I look at other options for analyzing A/B/n tests that can be more efficient.

Analysts Have Technical Debt, And It Will Kill You

Technical debt is well-ish understood by developers, and care is taken to make sure errors rarely make it into production. Yet analysts have to deal with technical debt too, and analysts are all too comfortable ignoring errors as a kind of statistical noise. However, these errors do not come from a nice distribution and can have devastating effects on an analysis.

DoE Fireside Chat on OCEAN

Anson and I sat down with Doug Montgomery to discuss design and analysis of experiments in an online setting. Anson and I also wore Hawaiian shirts!

Lighting a Christmas Tree 2019

Decorating the Christmas tree, 2019 edition!

Wilkinson-Rogers Notation: A Formula by any Other Name

The formula environment in R, which is also implemented in Python’s statsmodels package, is a powerful and expressive way to describe models and transforms of data. It’s often described as Wilkinson’s notation or Wilkinson-Rogers notation. Here, I break down how it works.

Python Package Importing

It turns out importing packages in an editable way breaks depending on how the packages were defined. It’s the kind of bug that, after driving one insane, makes one write about it to hopefully save anyone else the same trouble.

Suffix Trees for Log Aggregation

Email logs follow a standard in the same way toddlers follow a preschool teacher. Here, I use suffix trees to try to learn new patterns in log data.

Lighting a Christmas Tree 2018

Deck the halls with the results of my Christmas tree model! And learn how live trees fail to behave as cones! Wow!

Lighting a Christmas Tree

Let’s bring math to bear on decorating our Christmas tree! Here, I derive some equations and calculators to help get nice uniform lighting on a tree! And we kick it up a notch (do people still remember Emeril?) and create some nice patterns.

Taco Model

A taco festival, personal rating, Yelp data, machine learning, and finding the best tacos.

Fast Updating for Optimal Linear Model and Generalized Linear Model Experimental Designs

When things get big, refitting a GLM becomes too expensive, and instead you need to rely on some update functions that are in an expensive out-of-print Springer book or strewn across paywalled papers. Here, I derive (or motivate the derivation of) some useful updating identities for use in linear algebra for linear and generalized linear models.

Modeling Carryover Effects in Industrial Settings

Sometimes a test can leave lingering effects on the experimental unit. Here I try a simple approach to model this phenomenon and then use genetic algorithms to try to find a good design to mitigate as much as possible the impact of the carryover effects.

Testing for the Best of Times Without Knowing the Worst of Times

I was asked to answer a question about failure times when we had no data and the prospect of tremendous losses if we took a bad approach. Here’s my attempt to formulate some plan in such an environment.

Binomial Proportion Confidence Intervals are Ugly and Messy

Binomial confidence intervals will let you down and hurt you. Here, I look at how, when, and why, and look at alternative formulations of the intervals that have better behavior.
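
To make the complaint concrete, here is a small sketch comparing the textbook Wald interval with the Wilson score interval. Choosing Wilson as the alternative is an assumption of this example (the post may cover other formulations), as are the function names and the fixed normal quantile `z = 1.96` for roughly 95% coverage.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Textbook normal-approximation interval: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval, obtained by inverting the score test."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half
```

With zero successes in ten trials, `wald_interval(0, 10)` collapses to the single point `(0.0, 0.0)`, claiming certainty from ten observations, while `wilson_interval(0, 10)` runs from about 0 up to roughly 0.28.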