The Science of Today’s Technology, Data Science

Technology today…

Recently, there has been a surge in the consumption of, and innovation in, information-based technology all over the world. Everyone, from a child to an 80-year-old, uses the facilities that technology has provided. Along with this, the increase in population has also played a big role in the tremendous growth of information technology. Since hundreds of millions of people use this technology, the amount of data they generate is enormous too. Conventional database software such as Oracle and other SQL-based systems is often not enough on its own to process this volume of data. Hence the terms ‘Big data’ and ‘Data science’ were coined. Big data has made quite an impact on the world, and data science has recently risen to be one of the hottest topics. Now how are these two related?

What is data science?

It is the field of science in which different scientific approaches and methodologies are combined to extract knowledge from data. In layman’s terms, it is the science of studying data. This field has grown tremendously over the years, and at present almost every university has professors and students researching and exploring it.

Why is it such a hot topic though?

There has always been a need to record the data that people generate, because it helps in predicting the future and in studying how people’s way of living evolves. Data science plays a big role in recording, managing and retrieving this data. It is used to manage the large number of patients admitted to hospitals, to track the cars manufactured each day, to predict climate conditions in future years, and much more.

What more to know about it?

From the examples given above, you must have realized that technology is everywhere. Do you know how Netflix knows which movies and shows you might like? Well, it is all because of data science. Netflix uses machine learning algorithms to understand your preferences and to stay one step ahead of you. The languages used in this field include Python, Java and SQL. Before you step into the world of data science, it is important to have a good knowledge of mathematics and computer science along with these languages. Both can be considered basic requirements of the subject.

There has been a rise in the demand for data science as a subject in universities, but unfortunately there is no single curriculum to follow, since it is a very broad field. What’s interesting is that data science is often confused with data analytics. In case you face the same problem, the basic difference between the two is this: in data analytics one studies the past of the data, whereas in data science you study not only the past but also the present and the future of the data. It is also said that data science is the base of artificial intelligence, and everyone knows how artificial intelligence has made a dramatic entrance into our lives.

Growth of Information Technology

With each passing day, technology becomes more advanced. This could not be truer of computer technology. Information technology is on the verge of scaling new heights. Many new internet companies have become worldwide hubs for individuals offering freelance services. Whether it is IT-related work or plain data entry projects, people in developing countries have taken great advantage of this sort of opportunity.

Freelance outsourcing has become a sort of cottage industry, employing thousands of gifted computer scientists, developers and related professionals across the globe. Rich internet applications are becoming the norm as broadband connectivity becomes more readily available. More web sites are transforming into interactive web services that have the capacity to keep individuals on the web engaged and more focused.

Even desktop applications are advancing. Certain computing technologies that are available remain a novelty. One of them is touch-based computer systems, which remain the most promising candidate to cater to the needs of the next generation.

Online web surfing is becoming popular, and many people have started to work on home-based businesses and solutions to earn a living. This not only gives convenience and freedom of time but also relieves one of the stresses of office confines.

Social networking is one of the prime internet utilities nowadays, and it is picking up pace as new web sites and internet users rapidly grow in number. A few rules apply here. The first and most important is the power of context: whenever the popularity of a web site or product rises, the context in which it was launched becomes highly relevant.

New social networking web sites are now continually fulfilling the demands of their users; they are helping people make new business contacts, find freelance work and find the best dining or shopping places in the city.

Web sites usually carry a disclaimer notice that covers freedom of expression. The internet is thus far the most freely governed space in the world; it is covered by cyber laws that permit people to hold meaningful discussions even on sensitive and controversial topics.

In past centuries, for example in the 17th century, scholars associated themselves with many societies and communities through the exchange of letters. Today’s teenagers in particular are doing the same, using their potential in the more easily accessible environment of cyberspace, with fairly few risks involved.

Information technology and computers are playing a vital role in the world’s fast-growing technological changes. Communication is becoming easier, and the place that television and radio once had in an average person’s life is now almost taken up by computers. The rapid growth of internet use is bridging the gaps between people regardless of region or gender, and this technology stands above all others because there are no rigid rules that confine its development.

Information Technology Problem Solving – The 6 Principles of Scientific Problem Solving

This paper will explain a scientific approach to problem solving. Although it is written to address Information Technology related problems, the concepts might also be applicable in other disciplines. The methods, concepts, and techniques described here are nothing new, but it is shocking how many “problem solvers” fail to use them. Along the way I will include some real-life examples.

Why do problem solvers guess instead of following a scientific approach to problem solving? Maybe because it feels quicker? Maybe a lack of experience in efficient problem solving? Or maybe because it feels like hard work to do it scientifically? Maybe while you keep on guessing and not really solving, you generate more income and add some job security? Or maybe because you violate the first principle of problem solving: understand the problem.

Principle #1. Understand the *real* problem.

Isn’t it obvious that before you can solve a problem, you need to understand it? Maybe. But most of the time the solver will start solving without knowing the real problem. What the client or user describes as “The Problem” is normally only the symptom! “My computer does not want to switch on” is the symptom. The real problem could be that the whole building is without power. “Every time I try to add a new product, I get an error message” is the symptom. Here the real problem could be “Only the last 2 products I tried to add gave a ‘Product already exists’ error”. Another classic example: “Nothing is working”…

You start your investigation by defining the “real problem”. This entails asking questions (and sometimes verifying the answers) and doing some basic testing. Ask the user questions like “When was the last time it worked successfully?”, “How long have you been using the system?”, “Does it work on another PC or for another user?”, “What is the exact error message?”, etc. Ask for a screenshot of the error if possible. Your basic testing is to ensure that the end-to-end equipment is up and running. Check the user’s PC, the network, the web server, firewalls, the file server, the database back-end, etc. In the best case you will pinpoint the problem already. In the worst case you can eliminate a lot of areas as the cause of the problem.

A real-life example. The symptom according to the user: “The system hangs up at random times when I place orders”. The environment: the user enters the order detail on a form in a mainframe application. When all the detail is completed, the user tabs off the form. The mainframe then sends this detail via communication software to an Oracle client/server system at the plant. The Oracle system does capacity planning and returns either an error or an expected order date to the mainframe system. This problem is quite serious, because you can lose clients if they try to place orders and the system does not accept them! To attempt to solve this problem, people started by: 1) investigating the load and capacity of the mainframe hardware, 2) monitoring the network load between the mainframe and the Oracle system, 3) hiring consultants to debug the communication software, and 4) debugging the Oracle capacity planning system. After a couple of months they still could not solve the problem.

The “Scientific Problem Solver” was called in. It took less than a day and the problem was solved! How? The solver spent the day with the user to see what the “real problem” was. It was found that the problem only occurred with export orders. By investigating the capture screen and the user’s actions, it was found that with export orders the last field on the form was always left blank and the user did not tab off this field. The system was not hanging; it was waiting for the user to press “tab” one more time. Problem solved. Note that the “Scientific Problem Solver” had very limited knowledge of the mainframe, of the order capturing system, of the communication software, and of the Oracle capacity planning system. And this brings us to Principle #2.

Principle #2. Do not be afraid to start the solving process, even if you do not understand the system.

How many times have you heard “I cannot touch that code, because it was developed by someone else!”, or “I cannot help because I am an HR Consultant and that is a Finance problem”? If your washing machine does not want to switch on, you do not need to be an Electrical Engineer, Washing Machine Repair Specialist, Technician, or any other specialist to do some basic fault finding. Make sure the plug is working. Check the trip switch, etc. “I have never seen this error before” should not stop you from attempting to solve the problem. With the error message and an internet search engine, you can get lots of starting points.

In every complex system there are a couple of basic working principles. System A that reads data from System B can be horribly complex (maybe a laboratory spectrometer that reads data from a Programmable Logic Controller via an RS-232 port). But there are some basics to test for: Do both systems have power? Is there an error message in the event log on one of these systems? Can you “ping” or trace a network packet from the one system to the other? Try a different communication cable. Search the internet for the error message.
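One of the basic checks above, whether one system can reach the other over the network, can be sketched in a few lines of Python. This is only a minimal sketch, assuming both systems talk over TCP; the function name `can_connect` is hypothetical, and it is no substitute for `ping` or for swapping the cable on a serial link such as RS-232.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens the socket in one call.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

A call such as `can_connect("plant-db.example", 1521)` (host and port are made-up values) tells you in seconds whether the network path is worth investigating further, without requiring any knowledge of the application itself.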

Once you have established what the problem is, you need to start solving it. Sometimes the initial investigation will point you directly to the solution (switch the power on, replace the faulty cable, etc.). But sometimes the real problem is complex in itself, so the next principle is to “conquer it simple”.

Principle #3. Conquer it simple.

Let’s start this section with a real-life example. Under certain conditions, a stored procedure would hang. The stored procedure normally took about an hour to run (when it was not hanging). So the developer tried to debug it: make some changes, then wait another hour or so to see if the problem was solved. After some days the developer gave up and the “Problem Solver” took over. The “Problem Solver” had at his disposal the knowledge of the conditions under which the stored procedure would hang. So it was a simple exercise to make a copy of the procedure and then strip all unnecessary code from this copy. All parameters were replaced with hard-coded values. Bits of code were executed one at a time, and the result sets were then hard-coded into the copy of the procedure as well. Within 3 hours the problem was solved. An infinite loop was discovered.

What the “Problem Solver” did was to replicate the problem and, at the same time, isolate the code that caused it. In doing so, the complex (and time-consuming) stored procedure became something fast and simple.

If the problem is inside an application, create a new application and try to simulate the problem inside the new application as simply as possible. If the problem occurs when a certain method of a certain control gets called, then try to include only this control in the empty application and call that method with hard-coded values. If the problem is with embedded SQL inside a C# application, then try to simulate the SQL inside a database query tool (like SQL*Plus for Oracle, Query Analyzer for SQL Server, or use the code in MS Excel via ODBC to the database).

The moment you can replicate the problem in a simple way, you are more than 80% of the way to solving it.
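As an illustration of “conquer it simple”, the sketch below copies only a suspect loop out of a larger routine, feeds it hard-coded values, and adds an iteration guard so that an infinite loop shows up in milliseconds instead of after an hour. Everything here is hypothetical: `next_batch_start` merely stands in for whatever step you have isolated, and the zero batch size is an invented cause, not the one from the stored procedure described above.

```python
def next_batch_start(current: int, batch_size: int) -> int:
    """Hypothetical stand-in for the suspect step copied out of the big routine."""
    # Invented bug for the demo: a batch size of 0 never advances the position.
    return current + batch_size


def run_isolated(total_rows: int, batch_size: int, max_iterations: int = 1000) -> int:
    """Run only the suspect loop with hard-coded inputs; return the iteration count."""
    position = 0
    iterations = 0
    while position < total_rows:
        # The guard turns a silent hang into an immediate, descriptive error.
        if iterations >= max_iterations:
            raise RuntimeError(
                f"no progress after {iterations} iterations; position={position}"
            )
        position = next_batch_start(position, batch_size)
        iterations += 1
    return iterations
```

Calling `run_isolated(10, 5)` finishes after two iterations, while `run_isolated(10, 0)` raises at once and reports the position at which the loop is stuck.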

If you do not know where in the program the problem is, then use DEBUG.

Principle #4. Debug.

Most application development tools come standard with a debugger. Whether it is Macromedia Flash, Microsoft .NET, Delphi, or whatever development environment, there will be some sort of debugger. If the tool does not come standard with a debugger, then you can simulate one.

The first thing you want to do with the debugger is to determine where the problem is. You do this by adding breakpoints at key areas. Then you run the program in debug mode and you will know between which breakpoints the problem occurred. Drill down and you will find the spot. Now that you know where the problem is, you can “conquer it simple”.

Another nice feature of most debuggers is the facility to watch variables, values, parameters, etc. as you step through the program. With these values known at certain steps, you can hard-code them into your “simplified version” of the program.

If a development tool does not support debugging, then you can simulate it. Put steps in the program that output variable values and “hello I am here” messages to the screen, to a log file, or to a database table. Remember to take them out when the problem is resolved… you don’t want your file system to be cluttered or filled up with log files!
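A simulated debugger of this kind might look as follows in Python. This is a minimal sketch using the standard `logging` module; the function `apply_discount` and the logger name `trace_demo` are made up for the example. One possible advantage of a logger over bare print statements is that the trace can be silenced by raising the log level, although removing the statements afterwards, as advised above, is still the cleanest outcome.

```python
import logging

# Hypothetical logger name; any name works.
log = logging.getLogger("trace_demo")


def apply_discount(price: float, percent: float) -> float:
    """Example routine instrumented with "hello I am here" style trace messages."""
    log.debug("apply_discount entered: price=%r percent=%r", price, percent)
    discounted = price * (1 - percent / 100)
    log.debug("apply_discount result: %r", discounted)
    return discounted


if __name__ == "__main__":
    # Send trace output to the screen; pass filename="trace.log" to write a file instead.
    logging.basicConfig(level=logging.DEBUG)
    apply_discount(200.0, 25)
```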

Principle #5. There is a wealth of information on the database back-end that will help to solve a problem.

The “Problem Solver” was called to help solve a very tricky problem. A project was migrating systems from a mainframe to client-server technology. All went well during testing, but when the systems went live, all of a sudden there were quite a few, and quite random, “General Protection Faults”. (The GPF error was the general error trap in Windows 95 and 98.) Attempts were made to simplify the code and to debug it, but the problem was impossible to replicate. In the lab environment, it would not occur! Debugging trace messages written to log files indicated that the problem occurred very randomly. Some users experienced it more than others, but eventually all users would get it! Interesting problem.

The “Problem Solver” solved this after he started to analyze the database back-end. It is not clear whether this was by chance or because a scientific approach moved him systematically in the right direction. By tracing what was happening at the back-end level, it was found that all these applications were creating more and more connections to the database. Every time a user started a new transaction, another connection to the database was established. The connections were only released, all together, when the application was closed. As the user navigated to new windows inside the same application, more and more connections were opened, and after a specific number of connections the application would have had enough and crash. This was a programming fault in a template that was used by all the developers. The solution was to first test if a cursor to the database is already open, before opening it again.
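The fix described above, testing whether something is already open before opening it again, can be sketched as follows. This is an illustration only, not the original template code: it uses Python’s built-in `sqlite3` module as a stand-in for the real database, models the fix at the connection level, and the class name `ConnectionHolder` and its `opened` counter are hypothetical.

```python
import sqlite3
from typing import Optional


class ConnectionHolder:
    """Hands out one shared connection instead of opening a new one per window."""

    def __init__(self, database: str) -> None:
        self._database = database
        self._connection: Optional[sqlite3.Connection] = None
        self.opened = 0  # counts real opens, handy when hunting a leak

    def get(self) -> sqlite3.Connection:
        # Open only when nothing is open yet; otherwise reuse what we have.
        if self._connection is None:
            self._connection = sqlite3.connect(self._database)
            self.opened += 1
        return self._connection

    def close(self) -> None:
        # Release the connection explicitly rather than waiting for application exit.
        if self._connection is not None:
            self._connection.close()
            self._connection = None
```

With this in place, every window that calls `get()` shares the same connection, so the number of open connections stays at one no matter how far the user navigates.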

How do you trace what is happening on the back-end database? The main database providers have GUI tools that help you to trace or analyze which queries are fired against the database. These tools will also show you when people connect, disconnect, or were unable to connect because of security violations. Most databases also include some system dictionary tables that can be queried to get this information. These traces can sometimes tell the whole story of why something is failing. The query code you retrieve from the trace can help to “simplify the search”. You can see from the trace whether the program makes successful contact with the database. You can see how long it takes for a query to execute.
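The vendor tools differ from database to database, so the sketch below only illustrates the idea on a small scale, using the trace hook of Python’s built-in `sqlite3` module as a stand-in for a real back-end trace. The helper name `run_traced` is hypothetical; it records each statement the database actually executes and how long the call took, and it switches the trace off again when done.

```python
import sqlite3
import time
from typing import Any, List, Sequence, Tuple


def run_traced(
    conn: sqlite3.Connection, sql: str, params: Sequence[Any] = ()
) -> Tuple[List[Any], List[str], float]:
    """Run one query; return (rows, statements seen by the database, seconds elapsed)."""
    statements: List[str] = []
    # sqlite3 calls the callback with the text of every statement it executes.
    conn.set_trace_callback(statements.append)
    started = time.perf_counter()
    try:
        rows = conn.execute(sql, params).fetchall()
    finally:
        # Always switch tracing off again so it does not stay active longer than needed.
        conn.set_trace_callback(None)
    elapsed = time.perf_counter() - started
    return rows, statements, elapsed
```

The recorded statement text can then be pasted into a query tool and run on its own, which ties back to Principle #3.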

To add to Principle #2 (do not be afraid to start…): you can analyze this trace information even though you might not know anything about the detail of the application.

Remember, though, that these back-end traces can put a strain on the back-end resources. Do not leave them running for unnecessarily long.

Principle #6. Use fresh eyes.

This is the last principle. Do not spend too much time on the problem before you ask for assistance. The assistance does not have to come from someone more senior than you. The principle is that you need a pair of fresh eyes for a fresh perspective, and sometimes a bit of fresh air by taking a break. The other person will look and then ask a question or two. Sometimes it is something very obvious that was missed. Sometimes just answering the question makes you think in a new direction. Also, if you spend hours looking at the same piece of code, it is very easy to start overlooking a silly mistake. A lot of finance balancing problems get solved over a beer. It could be the change of scenery and/or the relaxed atmosphere that makes the solution pop out. Maybe it is the fresh oxygen that went to the brain while walking to the pub. Maybe it is because the problem got discussed with someone else.

Conclusion

After reading this paper, the author hopes that you will try these principles the next time you encounter a problem to solve. Hopefully, by applying these six principles, you will realize the advantages they bring, rather than “guessing” your way to a solution.