Author Archive: Yin Wang

RubySonar: a type inferencer and indexer for Ruby

(This is a reposting of a blog post I wrote on Sourcegraph‘s website) Since the last enhancement to our Python analysis, we have been working on improving the Ruby part of Sourcegraph by building a similar static analysis tool for Ruby. The result is a new open-source project RubySonar. With its help, now we are …

Continue reading

Tests and static analysis

Ever since I made a static analysis tool for Python called PySonar, I have been asked about the question: “What is the difference between testing and static analysis?” When I worked at Coverity, my coworkers told me that they also spent quite some time explaining to people about their difference. My answer to this question …

Continue reading

What makes python static analysis interesting and hard

(The original version of this post appeared in my company’s blog. Here is just a personal copy of it.) Python is a dynamic language in the authentic sense. It represents the world-view that types don’t really exist in reality, and this is probably right. Type systems are “thinking tools” that only exist in our head. …

Continue reading

PySonar2 successfully integrated with Sourcegraph.com

(The above picture was taken from: http://sourcegraph.com/github.com/mitsuhiko/flask) I recently joined a newly founded company called Sourcegraph.com. We build an intelligent code search engine using some of the most powerful programming language technologies. The difference between Sourcegraph and other code search sites is: Sourcegraph truly understands code and produces accurate results. Sourcegraph lets you semantically search and …

Continue reading

A pure logic negation operator for miniKanren

Have you ever noticed that some examples from The Reasoned Schemer are not so reasonable? If so, you may want to read this post. Motivation miniKanren is an educational logic programming language designed by Dan Friedman, William Byrd and Oleg Kiselyov. For teaching logic programming, they also co-authored the book The Reasoned Schemer (TRS). As a person hugely …

Continue reading

psydiff: a structural comparison tool for Python

(click on the above picture for a demo) Psydiff is a structural comparison tool for Python, written in Python itself. The main algorithm and UI of psydiff is almost the same as ydiff. If interested, you can see a demo of it here (psydiff comparing itself): http://www.yinwang.org/resources/pydiff1-pydiff2.html All the source code can be downloaded from my …

Continue reading

Null reference may not be a mistake

The null pointer is considered to be a “billion-dollar mistake” by Tony Hoare, but was he really saying that null pointer should never be used? After years of using languages both with null pointers (e.g. Java) and without them (e.g. Haskell), I found that null pointers are much easier and more natural to use than its …

Continue reading

How to reduce the contagious power of ‘const’

For the purpose of optimization and (possibly?) safety, the C++ programming language decided that the users should manually specify ‘const’ annotations for variables that they know will/should never be modified. While I admit the possible usefulness of this annotation, I often see it misused, and this often results in a contagious effect where unnecessarily many variables and parameters …

Continue reading

Why is indexing faster than binary search

We all know that indexing into an array takes only O(1) time, while searching for a number in a sorted array takes O(n) time with linear search, and O(log n) time with binary search. But why is indexing so fast? What is the secret sauce? The reason is really about the nature of indexing — …

Continue reading

On pureness

It is usually a pleasure to use a functional programming language, but it is doubtful whether we should care about the “pureness” of the language. To write programs in a completely pure functional programming language (e.g. Haskell) is much like living in a wired world. There are no electromagnetic waves nor sound waves in such a world, …

Continue reading

Undecidability proof of the halting problem without diagonalization

As a TA for a graduate level Theory of Computation course, I don’t understand why we have to teach the diagonalization proof of the undecidability of the halting problem. It is much easier if we just use a normal argument about functions in a programming language. Basically, functions correspond to Turing machines in a theory of computation …

Continue reading

A mind map guide for TeXmacs

TeXmacs was my best kept secret for homework assignments and papers with lots of mathematical formulas. It is a state-of-the-art typesetting system with unparalleled beauty. Many people think that it is just a “frontend” of TeX/LaTeX (like LyX), but actually TeXmacs is fundamentally superior than TeX. It inherited some good designs of TeX, but then surpassed …

Continue reading

Understanding the yin-yang puzzle

I have a friend who is a very fast learner. I introduced him to the Scheme programming language a few days ago and he soon digged into this yin-yang puzzle: (let* ((yin ((lambda (cc) (display #\@) cc) (call/cc (lambda (c) c)))) (yang ((lambda (cc) (display #\*) cc) (call/cc (lambda (c) c))))) (yin yang)) This program …

Continue reading

On software design patterns

In this post I try to answer the contraversial question Are software design patterns good or bad? This used to be a long post, but I have since condensed it into a very short one, thanks to the pictures. One picture is worth ten thousand words. Hopefully you will have understood most of Richard Gabriel’s …

Continue reading

Concurrent stack does not exist

Some weeks ago I read this thought-provoking article by Nir Shavit Data Structures in the Multicore Age. The topic is about efficiency of concurrent data structures. There have been many thinkings going on after the reading, but the most interesting point to me is that the author started trying to use a “concurrent stack”, but …

Continue reading

How to reinvent the Y combinator

I always believed that one can never learn the essence of anything without reinventing it. Once you reinvent things, you will never forget them — because otherwise you can just re-reinvent them. Today I found that I forgot how to derive the defintion of the Y combinator. I learned it several years ago from an …

Continue reading

A reformulation of reducibility

I found the theory of computation books very imprecise about their descriptions of Turing machines and reductions. They usually start with something precise and mathematical, for example they would define a Turing machine as a 7-tuple, but after that, everything about decidability and reduction is described with impenetrable paragraphs in natural languages. I believe that …

Continue reading

ADTs and objects

After reading William Cook’s essay On Understanding Data Abstraction, Revisited, let me try to condense the difference between abstract data types (ADTs) and objects into a few words. (To avoid ambiguity, I use “instances” to stand for data created by instantiating ADTs) “Instances” created by the same ADT share the same functions (methods). Functions may …

Continue reading

A bug in GHC’s type system

Several days ago, I implemented an experimental type inference system with first-class polymorphism. When comparing it with other systems, I found a possible bug in GHC’s type system regarding universal quantification. The phenomemon was confirmed and reproduced by people at #haskell IRC channel for GHC versions above 7.01. The code that causes trouble is: gen …

Continue reading

On point-free programming

Concatenative programming, or point-free style, is useful sometimes, but has some serious drawbacks similar to the SKI combinators. Applicative programs can be compiled into point-free style easily, but writing and reading them directly in large scale is usually a mental burden. It only works well with functions with one or two arguments. Concatenation of functions …

Continue reading