Posts

Automated Debugging of Mixed-Integer Models (Draft)

Motivating Example Suppose that you have an MLIP: $$\begin{align} P_0: \text{maximize} \quad & c_1x_1 + c_2x_2 \\ \text{subject to} \quad & a_1x_1 \leq L_1 \tag{1L, R_1 limit} \\ \quad & a_2x_2 \leq L_2 \tag{2L, R_2 limit} \\ \quad & b_{1,1}x_1 + b_{2,1}x_2 \geq D_1 \tag{1D} \\ \quad & b_{1,2}x_1 + b_{2,2}x_2 \geq D_2 \tag{2D} \\ \quad & x_1 \geq 0, \quad x_2 \geq 0 \\ \quad & x_1, x_2 \in \mathbb{N} \end{align}$$
Read more

Reading list: UDF compilation

Results on UDF compilation discussed in the “Server-side Logic Execution” lecture of CMU 15-721 were interesting. The first result in this area was: K. Ramachandra, et al., Froid: Optimization of Imperative Programs in a Relational Database (VLDB, 2017) https://arxiv.org/abs/1712.00498 A subset of UDF can be compiled to SQL. Then standard query optimization routines apply. Decorrelation of user defined function invocations in queries (2014, link) is what the Froid paper is based on.
Read more

How to compute IIS

Gurobi says their algorithm is based on the paper “Identifying Minimally Infeasible Subsystems of Inequalities” from 1990. This thread from Gurobi has some discussion on how Gurobi does it. An article by Chinneck. These slides from Chinneck. “Identifying Minimally Infeasible Subsystems of Inequalities” By John Gleeson, Jennifer Ryan. 1990. Link Notes: \( y_i := 1 \text{ iff the i^th constraint is deleted.} \) \( S_i := \text{ the set of constraint indices in the j^th IIS.
Read more

Simple properties of IIS

IIS stands for “Irreducible Infeasible Subsystem”. There is no Wikipedia entry for it. I’ve been using the definition from a Gurobi article. Let \( I_0 \) be a subset of constraints. \( I_0 \) is an IIS iff: \( I_0 \) is infeasible. Removing any single constraint (“or bound”) from \( I_0 \) makes it feasible. (I’ve never thought about what the “or bound” part is doing.) Property 1: an IIS does not consist of disjoint infeasible systems Example: the following constraints do not comprise an IIS.
Read more

Notes: Data Diffs

I came across Data diffs: Algorithms for explaining what changed in a dataset. The two papers discussed are: Scorpion by Eugene Wu and Sam Madden. Finds common properties of outlier points. DIFF. SQL implementation of Scorpion and similar. The computated can be distributed. An author of the DIFF paper, Peter Bailis, founded sisudata.com. The comments on HN list some related work: Dolt is a company whose product is a version-controlled SQL database.
Read more

Notes: Compressed Sensing

Lecture video, and what Terrence Tao has to say about the topic. The video is associated with this textbook. I came across the Coursera version of this course around 2013. The data analysis section of this course seems to be motivated by signal processing. The course has some overlap with Stanford’s EE263 (linear methods).

Learning Optimization, Stats, and ML

Optimization I’m interested in learning mathematical optimization (LP, ILP, convex opt), and stochastic problems of a similar flavor (stochastic control?). Probably for historical reasons, convex optimization is taught in EE, and integer programming is in OR. Stanford EE263 Linear Dynamical Systems An applied linear algebra course. Interesting content, good problems. “Linear methods can solve lots of things” is a statement that I often hear but without examples. This course makes that statement concrete.
Read more

ML theory papers

Mert Pilanci Neural Networks are Convex Regularizers Mert Pilanci. Any finite two-layer neural network with ReLU has an equivalent convex problem. Talk at Stanford. An audience member said that you want SDG to converge to a local minumum (for generalization?). That doesn’t sound right? A previous result showed that an infinite-width network is convex. This paper on the other hand gives a convex problem that can be implemented and solved. All Local Minima are Global for Two-Layer ReLU Neural Networks
Read more

Learning CS Systems

Online resources that I’ve found useful for learning computer science, especially systems. Last updated on 2019-11-23. Systems Beyond CMU 15-213. CMU 15-440 (Fall version) The textbook (Van Steen & Tanenbaum) is readable. The slides are informative as supplements to corresponding textbook chapters and referenced articles. Lab assignments are done in Go. Tons of debugging concurrency issues. 0. Go warm-up. A simple key-value server. Implement a subset of TCP starting with UDP.
Read more

Notes: Curse of Dimensionality

Geometry In \( \mathbb{R}^n \) where \( n \) is large, The volume of the unit ball goes to 0 as the dimension goes to infinity. Most of the volume of a n-ball is concentrated around its boundary. Pick two vectors on the surface of the unit ball independently. The two are orthogonal with high probability. Johnson-Lindenstrauss lemma: a set of points can be embedded almost isometrically w.r.t. \( L_2 \) into a space of a much lower dimension.
Read more