Three Reasons Research Scientists Should Avoid Programming and Data Analytics

Subtitle - This Story Does Not Need But Thank You Vocal for Forcing Me to Include One

By Everyday Junglist • Published 3 years ago • 6 min read

Although the hair fits the profile, this is almost certainly not a group of programmers and data scientists reading this article. Image by Sebastian Šoška from Pixabay

Intro and caveats

I recently read an article in which an engineer extolled the virtues of learning to program and the value of advanced data analytics to his career. The article was persuasive, and I have little doubt the advantages he mentioned were real. However, he was an engineer, not a scientist, and in this case what holds true for the one does not hold true for the other. In fact, I argue that learning to program, or spending your valuable time building expertise in advanced data analytics, is to the overall detriment of a career in research science. Obviously what I am about to say will not apply in 100% of cases. No doubt there will be exceptions. In the “hard sciences” [e.g. experimental or theoretical physics (programming and data analytics)] or even the social “sciences” [e.g. psychology or behavioral science (data analytics)] there may be advantages to, or even a need for, these skill sets. But I would suggest that the standard curriculum in those fields already provides them to the extent needed, and so these comments would not apply there. I will therefore restrict my comments to research in fields where the standard education and training programs do not normally include programming or advanced data analytics (except in a few special cases). Mainly, of course, I am thinking of the biological/life sciences, though there are certainly others. With those caveats in mind, here are three reasons why I believe research scientists in these fields should avoid both.

Reason 1: Data analytics puts the focus on the least important part of the scientific method

That data analysis is a part of the scientific method is no doubt true. However, it is the least important part, and the part that can be, and is, done by non-scientists all the time (e.g. data “scientists”). In fact a person is not even needed; it can be, and often is, done by machines. The fact that a computer can be programmed to do data analysis means that anybody could, hypothetically, do data analysis. Contrast that with hypothesis generation, another key component of the experimental method. Despite what some have claimed, computers cannot generate original hypotheses. They cannot draw conclusions from the results of experiments they designed and then use those conclusions to design further experiments. In fact they cannot design original experiments either. If they could, they should be proving it by publishing the results of their studies in the peer-reviewed primary scientific literature. That they are not and have not is proof enough that they cannot. They can be used as an aid in designing experiments, for example to do calculations or to generate images or figures, but they cannot, on their own, design an experiment to test a hypothesis. It would be a poor use of a research scientist’s time to focus on data analytics, for the simple reason that it is much more “cost” effective to hire a non-scientist to do it or to program a computer to do it. Original research science cannot be conducted with data analytics alone, and therefore to focus one’s energies on it is a waste of resources, of which the typical research scientist has a very limited supply.

Reason 2: Data analytics creates a false impression of progress when there is none

In research science, progress often comes slowly and is very often not noticeable until one is well into the work. In data analytics, “progress” in the form of an “answer” to the question posed of the data comes immediately. This can make it seem as if one is making progress on the actual problem at hand even when the question asked of the data is completely the wrong one, or the data themselves are in error. Data analysis provides an answer every single time. The rightness or wrongness of the answer is not part of the analysis. It is simply not considered at all. The more advanced and complicated the analysis, the easier it is to fall into this trap. How could I not be making progress when I now have 15 different graphs and charts breaking my data down in every imaginable way, all of it statistically significant, and probably very pretty to look at and important sounding too? A statistically significant wrong answer is still a wrong answer, and 15 pretty, complicated, important-sounding charts or graphs that are incorrect or misleading are just as incorrect as one ugly and simple chart or graph. In fact the 15 complicated but incorrect charts are probably worse, because they create an even greater impression of progress when there is none. You may say that it is not data analytics’ fault; it is the scientist’s fault for using the wrong data or using it incorrectly. You would be correct to say that. However, it is the data analysis that misled the scientist by obscuring, or completely blinding him to, the wrongness of his answer. He is ultimately responsible for that wrongness, but he did not create the false impression of progress. The data analysis did that.
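For readers who want to see the “answer every single time” problem in concrete form, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function name `count_false_positives`, the group sizes, the 0.05 threshold, and the use of a normal approximation in place of a proper t-test. The two groups being compared are pure random noise with no real difference between them, yet if the noise is sliced enough ways some of the comparisons will typically come back “statistically significant” anyway.

```python
import random
from statistics import NormalDist, mean, variance


def count_false_positives(n_comparisons=200, n_per_group=30, alpha=0.05, seed=0):
    """Count 'significant' differences found between groups of pure noise.

    Hypothetical illustration only: both groups are drawn from the same
    distribution, so every hit counted here is a wrong answer.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    std_normal = NormalDist()  # standard normal, used to approximate p-values
    hits = 0
    for _ in range(n_comparisons):
        # Two groups with no real difference between them.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        # Standard error of the difference in means.
        se = (variance(a) / n_per_group + variance(b) / n_per_group) ** 0.5
        z = (mean(a) - mean(b)) / se
        # Two-sided p-value under the normal approximation.
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_false_positives()
    print(f"{hits} of 200 comparisons on pure noise looked 'significant'")
```

Nothing in the calculation knows or cares that the data are meaningless. That judgment has to come from the scientist.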

Reason 3: Programming impedes your ability to think creatively

I saved this for last as it is by far the most contentious and controversial of my claims. Though some will no doubt take it as a criticism, it is not intended as such. Programming requires a very specific skill set and a very specific mode of thinking. It is a way of thinking that is circumscribed; it is boxed in, if you will. It is a mode of thinking based on rules that cannot be broken, and on a logic that is ironclad and never bending. It requires one to understand how to get from point A to point B in a certain way, using only the rules of the programming language at hand. It requires a certain kind of creativity, but it is the creativity of the home builder or home painter, not the architect or the artist. The home painter has to choose which colors best fit together and work in certain rooms of the house and not others, but he only has walls, doors, trim, and so on to paint, and he only has a limited set of tools (brushes, rollers, etc.) and colors to choose from. The artist, in contrast, can choose to paint, or draw, or sculpt, or whatever, and can work from any palette of colors or none at all. He may use a brush, or his hands, or a welding torch, or any tool he chooses, or he may use no tool at all. A similar comparison can be made between the architect and the home builder. The important point is that the more time one devotes to learning the specific task of painting houses or building homes (programming), the more difficult it becomes to be the artist or the architect. Research science requires one to be the architect or the artist in order to make real progress. Learning to program will be an impediment to that, because it forces a mode of thinking that is simply not compatible with it.
