
2022-12-08

Go for C# developers: LINQ

When I worked in C# I loved LINQ. I also probably used it more than I should have. I have recently looked into some options to bring LINQ to a Go project so let me share some observations.

TL;DR: I don't think we'll see much use of LINQ in idiomatic Go. Nor should we desire it.

But before looking at options, what does LINQ bring to the table? In my opinion, two good things.

First, it makes filtering and manipulating collections of data very concise. It can be argued that this conciseness makes programs harder to understand, and I agree. Any program can be written in a way that is easy or hard to understand, and LINQ sometimes makes it easier to write hard-to-understand code. This can, however, be mitigated by splitting a LINQ statement into multiple statements.

Second, and the most important reason I used LINQ, is that it defers enumeration until it is needed. This makes it very easy to abstract fetching data in batches etc. and still do complex computations on the data. This benefit is largely lost if you sort (which has to read the whole input before it can yield anything) or want distinct elements (which has to remember everything it has already seen), but a lot of the time this deferred enumeration made my code both efficient and easy to understand.
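To make the idea of deferred enumeration concrete, here is a minimal sketch in Go. The names Iterator, FromBatches, Where, Take and ToSlice are assumptions made up for this illustration and do not come from any of the packages discussed below. The point is only that nothing is fetched until the final consumer pulls elements, and that fetching stops as soon as enough elements have been produced.

```go
package main

import "fmt"

// Iterator yields the next element, or ok=false when the sequence is
// exhausted. The name and shape are assumptions made for this sketch.
type Iterator[T any] func() (item T, ok bool)

// FromBatches lazily pulls batches from fetch (a hypothetical data source)
// and yields their elements one at a time. fetch signals the end of the
// data by returning an empty batch.
func FromBatches[T any](fetch func() []T) Iterator[T] {
	var batch []T
	return func() (T, bool) {
		if len(batch) == 0 {
			batch = fetch()
			if len(batch) == 0 {
				var zero T
				return zero, false
			}
		}
		item := batch[0]
		batch = batch[1:]
		return item, true
	}
}

// Where yields only the elements for which pred is true. Nothing is read
// from src until the returned iterator is called.
func Where[T any](src Iterator[T], pred func(T) bool) Iterator[T] {
	return func() (T, bool) {
		for {
			item, ok := src()
			if !ok || pred(item) {
				return item, ok
			}
		}
	}
}

// Take yields at most n elements and then stops pulling from src.
func Take[T any](src Iterator[T], n int) Iterator[T] {
	return func() (T, bool) {
		if n <= 0 {
			var zero T
			return zero, false
		}
		n--
		return src()
	}
}

// ToSlice drains the iterator; this is the point where work actually happens.
func ToSlice[T any](src Iterator[T]) []T {
	var out []T
	for item, ok := src(); ok; item, ok = src() {
		out = append(out, item)
	}
	return out
}

// FirstEvens returns the first n even numbers from an endless batched
// source, together with the number of batches that had to be fetched.
func FirstEvens(n, batchSize int) (evens []int, fetches int) {
	next := 0
	fetch := func() []int {
		fetches++
		batch := make([]int, batchSize)
		for i := range batch {
			batch[i] = next
			next++
		}
		return batch
	}
	isEven := func(v int) bool { return v%2 == 0 }
	evens = ToSlice(Take(Where(FromBatches(fetch), isEven), n))
	return evens, fetches
}

func main() {
	evens, fetches := FirstEvens(3, 4)
	fmt.Println(evens, fetches) // prints "[0 2 4] 2"
}
```

Even though the source never ends, only two batches of four are fetched, because Take stops pulling once it has three elements.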

ahmetb/go-linq - using reflection

This package predates generics in Go and has correctly identified that there is no easy way to implement LINQ using generics. The good news is that if you are looking for a package that lets you write LINQ just like in C#, this is the package you are looking for. The bad news is performance. There are a lot of memory allocations, and since it uses reflection you don't get compile-time type safety.

ahmetb/go-linq pull request - adding generics

The main reason adding generics to this package is hard is that type parameters can only be declared on functions and types, not on methods. Hence something like Select, which needs a new type parameter for its result, becomes a little cumbersome to implement. This pull request works around that limitation by introducing a new intermediary type that holds all the types that are needed, which increases the length of your LINQ statement. I also think that intermediary step becomes confusing.

samber/lo - using generics

This package is really a bunch of small helpers and does not claim to be a replacement for LINQ. It will not allow you to build any "method trains", as each function takes the collection as an argument (i.e. receiver methods are not used). These helpers also do not defer enumeration, as they operate directly on slices and channels, which means performance is just as good as if you did the processing manually. Since these helpers don't do anything other than hide a few for statements from you, their usefulness seems questionable to me, and they are definitely not idiomatic Go, where explicit implementations are preferred.
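As a rough comparison, the sketch below puts the helper style next to the explicit loop. Filter and Map are local stand-ins written for this illustration; they are not the code from samber/lo and their signatures are assumptions, not that library's API.

```go
package main

import "fmt"

// Filter is a local stand-in for a slice helper (an assumption for this
// sketch, not a library function). It returns the elements keep accepts.
func Filter[T any](items []T, keep func(T) bool) []T {
	var out []T
	for _, item := range items {
		if keep(item) {
			out = append(out, item)
		}
	}
	return out
}

// Map is a local stand-in that applies f to every element.
func Map[T, U any](items []T, f func(T) U) []U {
	out := make([]U, 0, len(items))
	for _, item := range items {
		out = append(out, f(item))
	}
	return out
}

// SquaresOfEvensHelpers uses the helpers: two passes and one
// intermediate slice.
func SquaresOfEvensHelpers(nums []int) []int {
	evens := Filter(nums, func(n int) bool { return n%2 == 0 })
	return Map(evens, func(n int) int { return n * n })
}

// SquaresOfEvensLoop does the same work explicitly in a single pass.
func SquaresOfEvensLoop(nums []int) []int {
	var out []int
	for _, n := range nums {
		if n%2 == 0 {
			out = append(out, n*n)
		}
	}
	return out
}

func main() {
	nums := []int{1, 2, 3, 4}
	fmt.Println(SquaresOfEvensHelpers(nums)) // prints "[4 16]"
	fmt.Println(SquaresOfEvensLoop(nums))    // prints "[4 16]"
}
```

The explicit version is about the same length once the helper definitions are counted, and it avoids the intermediate slice.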

szmcdull/glinq - copying IEnumerable

This is an interesting approach where, similar to samber/lo, the constraints of generics are circumvented by using only functions rather than receiver methods. It introduces a copy of C#'s IEnumerable to defer enumeration. I think this implementation could have been much more idiomatic by reusing the Iterable concept used in the internals of ahmetb/go-linq.
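The difference between the two shapes can be sketched as follows. Enumerator below only mirrors the general MoveNext/Current shape of C#'s IEnumerator<T>; it is not the actual API of szmcdull/glinq. Iterator is a generic take on a closure-based iterator, which is one reading of the Iterable idea, and it is likewise an assumption rather than the real type from ahmetb/go-linq.

```go
package main

import "fmt"

// Enumerator mirrors the shape of C#'s IEnumerator<T>. The name and
// methods are hypothetical and chosen for this sketch only.
type Enumerator[T any] interface {
	MoveNext() bool
	Current() T
}

// sliceEnumerator walks a slice. pos is the index of the current element
// plus one, so it is 0 before the first call to MoveNext.
type sliceEnumerator[T any] struct {
	items []T
	pos   int
}

func (e *sliceEnumerator[T]) MoveNext() bool {
	if e.pos >= len(e.items) {
		return false
	}
	e.pos++
	return true
}

// Current must only be called after MoveNext has returned true.
func (e *sliceEnumerator[T]) Current() T { return e.items[e.pos-1] }

// FromSlice wraps a slice in an Enumerator.
func FromSlice[T any](items []T) Enumerator[T] {
	return &sliceEnumerator[T]{items: items}
}

// Iterator is the closure form: one function value, no interface and
// no separate "current element" state for the caller to misuse.
type Iterator[T any] func() (item T, ok bool)

// AsIterator adapts an Enumerator to the closure form.
func AsIterator[T any](e Enumerator[T]) Iterator[T] {
	return func() (T, bool) {
		if e.MoveNext() {
			return e.Current(), true
		}
		var zero T
		return zero, false
	}
}

// SumEnumerator consumes the C#-style shape.
func SumEnumerator(e Enumerator[int]) int {
	total := 0
	for e.MoveNext() {
		total += e.Current()
	}
	return total
}

// SumIterator consumes the closure shape.
func SumIterator(next Iterator[int]) int {
	total := 0
	for v, ok := next(); ok; v, ok = next() {
		total += v
	}
	return total
}

func main() {
	nums := []int{1, 2, 3}
	fmt.Println(SumEnumerator(FromSlice(nums)))           // prints "6"
	fmt.Println(SumIterator(AsIterator(FromSlice(nums)))) // prints "6"
}
```

Both defer enumeration equally well; the closure form simply needs less machinery and follows Go's usual "value, ok" convention.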

Conclusions

Given the constraints in how generics are implemented in Go, I don't expect to see anything like LINQ used very much. And I have come to think that is a good thing. A long chain of function calls where each of them performs a fairly simple task does not really make the code easier to understand. Even when using an approach where each part of the chain is independent (as in the last two packages above), the value of these generic functions is very limited compared to explicitly performing the desired actions.

LINQ really shines when it comes to the deferred enumeration of (partially) asynchronous steps, and Go already has a good solution for that: channels. So the use of an interface like the one in the last package above again does not add much value.
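A minimal sketch of such a channel-based pipeline follows. Generate, Where and Collect are names invented for this illustration. Each stage runs in its own goroutine and uses an unbuffered channel, so a value is only produced when the next stage is ready to receive it. The sketch assumes the consumer drains the pipeline completely; stopping early would need cancellation (for example via a context), which is left out here.

```go
package main

import "fmt"

// Generate sends 0..n-1 on the returned channel and then closes it.
func Generate(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; i < n; i++ {
			out <- i
		}
	}()
	return out
}

// Where forwards only the values for which pred is true.
func Where(in <-chan int, pred func(int) bool) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			if pred(v) {
				out <- v
			}
		}
	}()
	return out
}

// Collect drains the channel into a slice. Draining fully lets every
// stage goroutine in this sketch run to completion.
func Collect(in <-chan int) []int {
	var out []int
	for v := range in {
		out = append(out, v)
	}
	return out
}

func main() {
	isEven := func(v int) bool { return v%2 == 0 }
	fmt.Println(Collect(Where(Generate(6), isEven))) // prints "[0 2 4]"
}
```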

Another interesting observation is that maybe LINQ is a terrible idea in the first place. In all languages. It is certainly nice to create a chain of actions (i.e. a data processing pipeline) and reuse it on different pieces of data, regardless of whether the data is available all at once or not. And it might be helpful to have this data pipeline hide that fact. However, I don't think that makes it the most maintainable way to achieve the same result. At the end of the day, maybe LINQ is best suited to letting people familiar with databases easily carry their SQL skills over into another programming language, but I am no longer convinced LINQ is a good tool if you know the language well and want to do data processing.

Final thoughts

Since this was not the outcome I expected from my research into LINQ for Go, and I'm sure this opinion is not the most popular (especially among C# programmers), I'd love to get examples (in Go and C#) where a LINQ statement is actually easier to understand than the alternative (explicit code). And remember, shorter does not necessarily mean easier to understand!

Here is an example I created that is borderline better as LINQ. The use of Distinct and Sort is very short and clear, but the Go version is not much longer and follows the usual patterns for achieving the same thing. And even in a simple example like this, the non-LINQ version has the benefit of doing Select and Distinct in one place rather than separating them.
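As an illustration of this kind of comparison (a sketch under stated assumptions, not the example referred to above), consider collecting the distinct cities from a list of people in sorted order. In C# that could be written as people.Select(p => p.City).Distinct().OrderBy(c => c). The Person type and the DistinctCities function below are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Person is a hypothetical type used only for this sketch.
type Person struct {
	Name string
	City string
}

// DistinctCities returns the distinct cities of the given people in
// sorted order. The projection (Select) and the de-duplication (Distinct)
// happen in the same loop; sorting is a separate final step.
func DistinctCities(people []Person) []string {
	seen := make(map[string]struct{})
	var cities []string
	for _, p := range people {
		if _, ok := seen[p.City]; ok {
			continue
		}
		seen[p.City] = struct{}{}
		cities = append(cities, p.City)
	}
	sort.Strings(cities)
	return cities
}

func main() {
	people := []Person{
		{Name: "Ann", City: "Oslo"},
		{Name: "Bob", City: "Bergen"},
		{Name: "Cid", City: "Oslo"},
	}
	fmt.Println(DistinctCities(people)) // prints "[Bergen Oslo]"
}
```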
