Altabel Group's Blog

Posts Tagged ‘data’

What is the hottest trend in artificial intelligence right now? Machine Learning is the right answer! Thanks to technological advances and emerging frameworks, Machine Learning may soon hit the mainstream. Because of new computing technologies, Machine Learning today is not like the Machine Learning of the past. While many Machine Learning algorithms have been around for a long time, the ability to automatically apply complex mathematical calculations to big data, over and over, faster and faster, is a recent development. Every single day it becomes clearer that Machine Learning is already forcing massive changes in the way companies operate. Fortune 500 companies are already running more efficiently, and making more money, because of Machine Learning. But how does this “phenomenon” help businesses earn money and attract new customers?

Problems that can be easily solved using ML

Every business faces particular problems sooner or later. But there are some kinds of business problems that Machine Learning can handle outright, or even prevent:

Email spam filters
Some spam filtering can be done with rules (e.g., by blocking IP addresses known for sending spam), but much of the filtering is contextual, based on the inbox content relevant to each specific user. High email volume and many users marking messages as “spam” (labeling the data) make for a good supervised learning problem.
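As a minimal sketch of spam filtering framed as supervised learning (assuming Python and scikit-learn; the post names no specific tooling), the tiny inline dataset below is purely illustrative, while a real filter trains on millions of user-labeled messages:

```python
# Hypothetical toy example: spam filtering as supervised learning.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# User-labeled messages (1 = spam, 0 = not spam); invented for illustration.
messages = [
    "win a free prize now", "cheap meds online", "claim your reward",
    "meeting at 10 tomorrow", "lunch next week?", "quarterly report attached",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag-of-words features plus naive Bayes: a classic text-classification baseline.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["free reward, claim now", "see you at the meeting"]))
# Expected: [1 0] -- the first message looks like spam, the second does not.
```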

Speech recognition
There is no single combination of sounds that specifically signals human speech, and individual pronunciations differ widely, but Machine Learning can identify patterns of speech and help convert speech to text. Nuance Communications (maker of Dragon Dictation) is among the better-known speech recognition companies today.
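A minimal sketch of the idea, assuming the open-source SpeechRecognition package for Python (which wraps several recognition engines); “clip.wav” is a placeholder file, and the Google Web Speech backend used here needs network access:

```python
# Hedged sketch: speech-to-text with the SpeechRecognition package.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("clip.wav") as source:   # placeholder audio file
    audio = recognizer.record(source)      # read the whole clip

# Sends the audio to the free Google Web Speech API and prints its transcript.
print(recognizer.recognize_google(audio))
```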

Face detection
It’s incredibly difficult to write a set of “rules” to allow machines to detect faces (consider all the different skin tones, viewing angles, hair and facial hair, etc.), but an algorithm can be trained to detect faces, like those used at Facebook. Many tools for face detection and recognition are open source.
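One of those open-source tools is OpenCV, which ships with pretrained Haar-cascade face detectors. A minimal sketch (“photo.jpg” is a placeholder path, not a file from the post):

```python
# Face detection with OpenCV's bundled Haar-cascade model.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

image = cv2.imread("photo.jpg")            # placeholder image path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Returns one (x, y, w, h) rectangle per detected face.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"Detected {len(faces)} face(s)")
```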

Credit card purchase fraud detection
Like email spam filters, only a small portion of fraud detection can be done using concrete rules. New fraud methods appear constantly, and systems must adapt to detect these patterns in real time, coaxing out the common signals associated with fraud.
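A hedged sketch of the idea, using an isolation forest from scikit-learn as an anomaly detector over made-up transaction amounts (real systems combine many more signals, such as merchant, location and velocity, and run in real time):

```python
# Unsupervised screening of transaction amounts for anomalies.
import numpy as np
from sklearn.ensemble import IsolationForest

# Mostly ordinary purchase amounts, plus two extreme outliers (synthetic data).
rng = np.random.default_rng(0)
amounts = np.concatenate([rng.normal(40, 10, 500), [900.0, 1500.0]])
X = amounts.reshape(-1, 1)

detector = IsolationForest(contamination=0.01, random_state=0).fit(X)
flags = detector.predict(X)   # -1 = anomalous, 1 = normal
print(amounts[flags == -1])   # the extreme amounts are flagged for review
```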

Product / music / movie recommendation
Each person’s preferences are different, and preferences change over time. Companies like Amazon, Netflix and Spotify use ratings and engagement from a huge volume of items (products, songs, etc.) to predict what any given user might want to buy, watch, or listen to next.
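As a toy illustration of collaborative filtering (the ratings matrix below is invented, and production systems at Amazon or Netflix are far richer), item-to-item similarity can be computed directly from user ratings:

```python
# Item-based collaborative filtering with NumPy (toy data).
import numpy as np

# Rows = users, columns = items; 0 means "not rated yet".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Cosine similarity between item columns.
norms = np.linalg.norm(ratings, axis=0)
sim = (ratings.T @ ratings) / np.outer(norms, norms)

# Score items for user 0 by similarity to the items they already rated.
user = ratings[0]
scores = sim @ user
scores[user > 0] = -np.inf                      # don't re-recommend rated items
print("Recommend item", int(np.argmax(scores))) # item 2, the unrated one
```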

These are just a few of the problems that can be solved, and over time the list will only expand.

Industries that already put ML into action

Most industries working with large amounts of data have recognized the value of Machine Learning technology. Adoption of Machine Learning is likely to be diverse, spanning a range of industries including retail, automotive, financial services, health care, and more. By gleaning insights from this data, often in real time, organizations are able to work more efficiently or gain an advantage over competitors. In some cases, it will help transform the way companies interact with customers.

Retail industry
Machine Learning could completely reshape the retail customer experience. The improved ability to use facial recognition as a customer identification tool is being applied in new ways by companies such as Amazon at its Amazon Go stores or through its Alexa platform. Amazon Go removes the need for checkouts through the use of computer vision, sensor fusion, and deep learning, and it’s expected that many shopping centers and retailers will start to explore similar options this year.

Financial services
Banks and other businesses in the financial industry use Machine Learning technology for two key purposes: to identify important insights in data and to prevent fraud. The insights can identify investment opportunities or help investors know when to trade. Data mining can also identify clients with high-risk profiles, or use cyber surveillance to pinpoint warning signs of fraud.

Health care
Machine Learning is a fast-growing trend in the health care industry, thanks to the advent of wearable devices and sensors that can use data to assess a patient’s health in real time. The technology can also help medical experts analyze data to identify trends or red flags that may lead to improved diagnoses and treatment. Machine Learning can be used to understand risk factors for disease in large populations. For instance, Medecision developed an algorithm that identifies eight variables to predict avoidable hospitalizations in diabetes patients.
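Medecision’s actual model is not public, so the following is only a generic, hypothetical analogue of that kind of risk prediction: a logistic regression over a handful of patient variables, trained on synthetic data purely to show the shape of the problem.

```python
# Hypothetical sketch: predicting avoidable hospitalizations from 8 variables.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Eight illustrative variables per patient (e.g. age, HbA1c, prior visits).
X = rng.normal(size=(200, 8))
# Synthetic outcome: 1 = an avoidable hospitalization occurred.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 200) > 1).astype(int)

model = LogisticRegression().fit(X, y)
print("Estimated risk for a new patient:", model.predict_proba(X[:1])[0, 1])
```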

Oil and gas
Finding new energy sources. Analyzing minerals in the ground. Predicting refinery sensor failure. Streamlining oil distribution to make it more efficient and cost-effective. These are just some of the things you can do with ML. For example, ExxonMobil, the largest publicly traded international oil and gas company, uses technology and innovation to help meet the world’s growing energy needs. ExxonMobil’s Corporate Strategic Research (CSR) laboratory is a powerhouse in energy research, focusing on fundamental science that can lead to technologies with a direct impact on solving our biggest energy challenges.

Government
Government agencies such as public safety and utilities have a particular need for Machine Learning, since they have multiple sources of data that can be mined for insights. Analyzing sensor data, for example, identifies ways to increase efficiency and save money. Machine Learning can also help detect fraud and minimize identity theft. Chicago’s Department of Public Health is an early adopter: it used to identify children with dangerous levels of lead in their bodies through blood tests and then cleanse their homes of lead paint. Now it tries to spot vulnerable youngsters before they are poisoned.

Marketing and sales
Websites recommending items you might like based on previous purchases are using Machine Learning to analyze your buying history – and promote other items you’d be interested in. This ability to capture data, analyze it and use it to personalize a shopping experience (or implement a marketing campaign) is the future of retail. PayPal, for example, is using Machine Learning to fight money laundering. The company has tools that compare millions of transactions and can precisely distinguish between legitimate and fraudulent transactions between buyers and sellers.

Transportation
Analyzing data to identify patterns and trends is key to the transportation industry, which relies on making routes more efficient and predicting potential problems to increase profitability. The data analysis and modeling aspects of Machine Learning are important tools for delivery companies, public transportation and other transportation organizations. In some cases, mathematical models are used to optimize shipping routes. By homing in on inefficient driving routes, drivers can cut nearly one mile of driving every day. For a company like UPS, a reduction of one mile per day per driver would equal savings of as much as $50 million a year in fuel.
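As a toy sketch of the optimization idea (the stop coordinates are invented, and real fleet-routing systems solve far larger, constraint-laden problems), even a simple nearest-neighbour heuristic visibly shortens a route compared with visiting stops in their listed order:

```python
# Toy route optimization: nearest-neighbour heuristic over made-up stops.
import math

stops = [(0, 0), (2, 1), (1, 5), (5, 2), (6, 6), (3, 3)]  # depot first

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

route, remaining = [stops[0]], set(stops[1:])
while remaining:              # always drive to the closest unvisited stop
    nxt = min(remaining, key=lambda s: dist(route[-1], s))
    route.append(nxt)
    remaining.remove(nxt)

naive = sum(dist(a, b) for a, b in zip(stops, stops[1:]))
greedy = sum(dist(a, b) for a, b in zip(route, route[1:]))
print(f"in-order route: {naive:.2f}, greedy route: {greedy:.2f}")
```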

Have you ever worked with ML? Was it useful for your business? Or are you still weighing whether it is worth implementing Machine Learning in your business, and whether it would be relevant and defensible? If you have an answer to at least one of these questions, share your experience; we will be happy to discuss it in the comments. And if you don’t have an answer yet, remember: big companies are investing in Machine Learning not because it’s a fad or because it makes them seem cutting edge. They invest because they’ve seen positive ROI. And that’s why innovation will continue.

 

Yuliya Poshva

Business Development Manager

E-mail: yuliya.poshva@altabel.com
Skype: juliaposhva
LI Profile: Yuliya Poshva

 


Altabel Group

Professional Software Development

E-mail: contact@altabel.com
www.altabel.com

The big languages are popular for a reason: they offer a huge foundation of open source code, libraries, and frameworks that make finishing the job easier. But sometimes the vast resources of the popular, mainstream programming languages aren’t enough to solve your particular problem. Sometimes you have to look beyond the obvious to find the right language, one whose structure makes the difference, or that offers an extra feature to help your code run significantly faster without endless tweaking and optimizing, or that produces vastly more stable and accurate code because it prevents you from writing sloppy or wrong code.

The world is filled with thousands of clever languages that aren’t C#, Java, or JavaScript. Some are treasured by only a few, but many have flourishing communities connected by a common love for the language’s facility in solving certain problems. Tens of millions of programmers may not know the syntax, but sometimes there is value in doing things a little differently, and experimenting with a new language can pay significant dividends on future projects.

The following six languages should be on every programmer’s radar. They may not be the best for every job — many are aimed at specialized tasks. But they all offer upsides that are worth investigating and investing in. There may be a day when one of these languages proves to be exactly what your project — or boss — needs.

Erlang: Functional programming for real-time systems

Erlang’s secret is the functional paradigm. Most of the code is forced to operate in its own little world where it can’t corrupt the rest of the system through side effects. The functions do all their work internally, running in little “processes” that act like sandboxes and only talk to each other through mail messages. You can’t merely grab a pointer and make a quick change to the state anywhere in the stack. You have to stay inside the call hierarchy. It may require a bit more thought, but mistakes are less likely to propagate.

The model also makes it simpler for the runtime to determine what can run at the same time. With concurrency so easy to detect, the runtime scheduler can take advantage of the very low overhead of setting up and tearing down a process. Erlang fans like to boast about running 20 million “processes” at the same time on a web server.

If you’re building a real-time system with no room for dropped data, such as a billing system for a mobile phone switch, then check out Erlang.

Go: Simple and dynamic

Google wasn’t the first organization to survey the collection of languages, only to find them cluttered, complex, and often slow. In 2009, the company released its solution: a statically typed language that looks like C but includes background intelligence to save programmers from having to specify types and juggle malloc calls. With Go, programmers can have the terseness and structure of compiled C, along with the ease of using a dynamic script language.

While Sun and Apple followed a similar path in creating Java and Swift, respectively, Google made one significantly different decision with Go: the language’s creators wanted to keep Go “simple enough to hold in one programmer’s head.” Thus, there are few zippy extras like generics, type inheritance, or assertions, only clean, simple blocks of if-then-else code manipulating strings, arrays, and hash tables.

The language is reportedly well-established inside of Google’s vast empire and is gaining acceptance in other places where dynamic-language lovers of Python and Ruby can be coaxed into accepting some of the rigor that comes from a compiled language.

If you’re a startup trying to catch Google’s eye and need to build some server-side business logic, Go is a great place to start.

Groovy: Scripting goodness for Java

The Java world is surprisingly flexible. Say what you will about its belts-and-suspenders approach, like specifying the type for every variable, ending every line with a semicolon, and writing access methods that simply return a value; the Java world still looked at the dynamic languages gaining traction and built its own version that’s tightly integrated with Java.

Groovy offers programmers the ability to toss aside all the humdrum conventions of brackets and semicolons, to write simpler programs that can leverage all that existing Java code. Everything runs on the JVM. Not only that, everything links tightly to Java JARs, so you can enjoy your existing code. The Groovy code runs like a dynamically typed scripting language with full access to the data in statically typed Java objects. Groovy programmers think they have the best of both worlds. There’s all of the immense power of the Java code base with all of the fun of using closures, operator overloading, and polymorphic iteration.

Finally, all of the Java programmers who’ve envied the simplicity of dynamic languages can join the party without leaving the realm of Java.

CoffeeScript: JavaScript made clean and simple

Technically, CoffeeScript isn’t a language. It’s a preprocessor that converts what you write into JavaScript. But it looks different because it’s missing plenty of the punctuation. You might think it is Ruby or Python, though the guts behave like JavaScript.

CoffeeScript began when semicolon haters were forced to program in JavaScript because that was what Web browsers spoke. Changing the way the Web works would have been an overwhelming task, so they wrote their own preprocessor instead. The result? Programmers can write cleaner code and let CoffeeScript turn it back into the punctuation-heavy JavaScript Web browsers demand.

Missing semicolons are only the beginning. With CoffeeScript, you can define a function without typing function or wrapping it in curly brackets. In fact, curly brackets are pretty much nonexistent in CoffeeScript. The code is so much more concise that it looks like a modernist building compared to a Gothic cathedral. This is why many of the newest JavaScript frameworks are often written in CoffeeScript and compiled.

Haskell: Functional programming, pure and simple

For more than 20 years, the academics working on functional programming have been actively developing Haskell, a language designed to encapsulate their ideas about the evils of side effects. It is one of the purer expressions of the functional programming ideal, with a careful mechanism for handling I/O channels and other unavoidable side effects. The rest of the code, though, should be perfectly functional.

The community is very active, with more than a dozen variants of Haskell waiting for you to explore. Some are stand-alone, and others are integrated with more mainstream efforts like Java (Jaskell, Frege) or Python (Scotch). Most of the names seem to be references to Scotland, a hotbed of Haskell research, or philosopher/logicians who form the intellectual provenance for many of the ideas expressed in Haskell. If you believe that your data structures will be complex and full of many types, Haskell will help you keep them straight.

Julia: Bringing speed to Python land

The world of scientific programming is filled with Python lovers who enjoy the simple syntax and the freedom to avoid thinking of gnarly details like pointers and bytes. For all its strengths, however, Python is often maddeningly slow, which can be a problem if you’re crunching large data sets as is common in the world of scientific computing. To speed up matters, many scientists turn to writing the most important routines at the core in C, which is much faster. But that saddles them with software written in two languages and is thus much harder to revise, fix, or extend.

Julia is a solution to this complexity. Its creators took the clean syntax adored by Python programmers and tweaked it so that the code can be compiled in the background. That way, you can set up a notebook or an interactive session as with Python, but any code you create will be compiled immediately.

The guts of Julia are fascinating. They provide a powerful type inference engine that can help ensure faster code. If you enjoy metaprogramming, the language is flexible enough to be extended. The most valuable additions, however, may be Julia’s simple mechanisms for distributing parallel algorithms across a cluster. A number of serious libraries already tackle many of the most common numerical algorithms for data analysis.

The best news, though, may be the high speeds. Many basic benchmarks run 30 times faster than Python and often run a bit faster than C code. If you have too much data but enjoy Python’s syntax, Julia is the next language to learn.


Polina Mikhan
Polina.Mikhan@altabel.com 
Skype ID: poly1020
Business Development Manager (LI page)
Altabel Group – Professional Software Development

 

WHAT

In today’s business and technology world you can’t have a conversation without touching upon the issue of big data. Some would say big data is a buzzword and the topic is not new at all. Still, from my point of view, over the last two or three years the reality around data has been changing considerably, so it makes sense that big data is discussed so hotly. And the figures prove it.

IBM reports we create 2.5 quintillion bytes of data every day. In 2011 our global output of data was estimated at 1.8 billion terabytes. What impresses is that, according to Big Blue, 90 percent of the data in the world today was created in the past two years. In the information century, those who own the data, can analyze it properly, and then use it for decision-making purposes will definitely rule the world. But if you don’t have the tools to manage and perform analytics on that never-ending flood of data, it’s essentially garbage.

Big data is not really a new technology, but a term used for a handful of technologies: analytics, in-memory databases, NoSQL databases, Hadoop. They are sometimes used together, sometimes not. While some of these technologies have been around for a decade or more, a lot of pieces are coming together to make big data the hot thing.

Big data is so hot and is changing things for the following reasons:
– It can handle massive amounts of all sorts of information, from structured, machine-friendly information in rows and columns to the more human-friendly, unstructured data from sensors, transaction records, images, audio and video, social media posts, logs, wikis, e-mails and documents;
– It works fast, almost instantly;
– It is affordable because it uses ordinary low-cost hardware.

WHY NOW

Big data is possible now because other technologies are fueling it:
– Cloud computing provides affordable access to a massive amount of computing power and to loads of storage: you don’t have to buy a mainframe and a data center, and you pay just for what you use.
– Social media allows everyone to create and consume a lot of interesting data.
– Smartphones with GPS offer lots of new insights into what people are doing and where.
– Broadband wireless networks mean people can stay connected almost everywhere and all the time.

HOW

The majority of organizations today are making the transition to a data-driven culture that leverages data and analytics to increase revenue and improve efficiency. This calls for a comprehensive approach, the so-called MORE approach that Avanade recommends:
– Merge: to squeeze the value out of your data, you need to merge data from multiple sources, like structured data from your CRM and unstructured data from social news feeds, to gain a more holistic view. The challenge here is understanding which data to bring together to provide actionable intelligence.
– Optimize: not all data is good data, and if you start with bad data, a data-driven approach will just have you making bad decisions faster. You should identify, select and capture the optimal data set for the decisions at hand. This involves framing the right questions and utilizing the right tools and processes.
– Respond: just having data does not mean acting on it. You need the proper reporting tools in place to surface the right information to the people who need it, and those people then need the processes and tools to take action on their insights.
– Empower: data can’t be locked in silos, and you need to train your staff to recognize and act on big data insights.

And what is big data for your company? Why do you use it? And how do you approach a data-driven decision-making model in your organization?

It would be interesting to hear your point of view.


Helen Boyarchuk
Helen.Boyarchuk@altabel.com
Skype ID: helen_boyarchuk
Business Development Manager (LI page)
Altabel Group – Professional Software Development

Data is something that companies grapple with every day; after all, we are in the era of Big Data: how to gather it, analyze it and interpret it. But one important part of dealing with data is figuring out how and where to store it. Below are 10 things to think about when choosing the right data storage technologies for your enterprise or project.

1. Consider all your options.

Relational databases may still be dominant, but their hold has slipped. While they have been a successful, leading data storage technology for 20 years, IT architects have been challenged by the impedance mismatch between the relational model and in-memory data structures, and by the unstructured nature of much of the data. Now there is a movement away from using databases as integration points, as the need to support large volumes of data by running on clusters is changing how data is stored. Relational databases still provide advantages and, for now, will continue to be used in most cases. However, multiple database options are now available, depending on the nature of the data stored and how it will be manipulated.

2. How big is your data?

When evaluating data storage technologies, it’s important to know how much data you’re dealing with and in what format. With organizations grappling with massive amounts of unstructured data, a new class of data storage technology has emerged as the “king” of Big Data: NoSQL. The need for rapid access to lots of unstructured data has driven the growing use of NoSQL databases, which process large volumes of data on clusters of machines more efficiently than relational databases.

3. If developer productivity and large-scale data are your pain points, NoSQL may be a good choice.

NoSQL is generally applied to a number of recent non-relational databases such as Cassandra, Mongo and Riak. The common characteristics of NoSQL databases include:

– Not using the relational model
– Running well on clusters
– Open-source
– Built for 21st century web estates
– Schemaless
– Horizontally scalable
The two main reasons for using NoSQL technology are to improve programmer productivity by using a database that better matches an application’s needs and to improve data access performance via some combination of handling larger data volumes, reducing latency, and improving throughput.

4. Different business problems need different solutions.

Storing user activity on websites is totally different from finding out which of your users are most connected to other users, or from dealing with huge write volumes such as capturing a live stream of data. These different problems need different solutions, and IT architects should make sure they understand the problem and choose the right solution rather than making the default choice.

5. If you’re working with NoSQL databases, consider the data model types.

There is a common approach to categorizing NoSQL databases according to their data models (a short sketch after this list shows the first two types in action). These include:

* Key-Value Databases – Key-value stores are simple hash tables, primarily used when all access to the database is via a primary key. These are the simplest NoSQL data stores to use from an API perspective. Some of these databases include: Riak, Redis or MemcachedDB.
* Document Databases – Document Databases store and retrieve documents. These are self-describing, hierarchical tree data structures, which can consist of maps, collections, and scalar values. Some of these databases include MongoDB, CouchDB, Terrastore and others.
* Column-Family Stores – Column family stores, such as Cassandra, HBase and Amazon SimpleDB, allow you to store data with keys mapped to values and the values grouped into multiple column families, each column family being a map of data.
* Graph Databases – Graph databases, such as Neo4J, Infinite Graph or OrientDB, allow you to store entities, also known as nodes, and the relationships between these entities.
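A minimal sketch of the first two model types side by side, assuming the redis and pymongo Python client packages and local Redis and MongoDB servers running on their default ports:

```python
# Key-value vs. document store access (assumes local Redis and MongoDB).
import redis                        # key-value store client
from pymongo import MongoClient     # document store client

# Key-value: simple set/get against a primary key.
kv = redis.Redis(host="localhost", port=6379)
kv.set("session:42", "alice")
print(kv.get("session:42"))         # b'alice'

# Document: store and query a self-describing, hierarchical record.
docs = MongoClient("localhost", 27017).shop.customers
docs.insert_one({"name": "alice", "orders": [{"sku": "A1", "qty": 2}]})
print(docs.find_one({"name": "alice"})["orders"])
```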

6. Scale solutions to suit growth of data.

The rate of growth of data is no longer predictable. Gone are the days when we could plan for three-year cycles to upgrade hardware and do capacity planning. NoSQL allows scaling for performance and volume without any downtime by allowing expansion of clusters transparently.

7. You may need more than one data storage technology.

The most important outcome of the rise of NoSQL is the acceptance of database technologies beyond relational databases. However, NoSQL is only one set of data storage technologies, and other data storage technologies should be considered whether or not they bear the NoSQL label. Other options include file systems, event sourcing, memory image, version control, XML databases and object databases. This has led to a new era of “Polyglot Persistence.”

Polyglot persistence is about using different data storage technologies to handle varying data storage needs. It can apply across an enterprise or within a single application.

8. NoSQL solutions can be introduced in existing applications.

In existing applications, functions that don’t need relational databases, such as searching, content indexing, or tracking relationships between customers and products, can be moved to NoSQL solutions, allowing the applications to scale and react to emerging customer needs.

9. Remember to consider the complexities.

Employing more data storage technologies increases complexity in programming and operations, so the advantages of a good data storage fit must be weighed against this complexity before moving forward with a specific technology.

10. Experiment!

Only by working with NoSQL and others – and discovering their strengths and weaknesses – can IT architects understand these new data storage technologies. In the future, organizations will use many data technologies. Data professionals will need to be familiar with these different approaches and know how to match them to different problems. When you introduce different data storage technologies, you will need to think about new ways of data modeling, data consistency, and evolution.

Learning the concepts is an important first step, but to really understand multiple storage technologies, you’ll need to get the experience of building representative systems using them.

Best Regards,

Kristina Kozlova

Marketing Manager

 


Altabel Group

Professional Software Development

E-mail: contact@altabel.com
www.altabel.com

The value of a lean startup approach is that you are not heavily investing upfront in unnecessary expenses. Your budget should be allocated toward developing a prototype to test against a small group, to see whether your target audience loves it or hates it. This will give you a more accurate idea of the product’s potential value and the cost of improving it, and maybe a couple of example customers.

The Lean Startup has evolved into a movement that is having a significant impact on how companies are built, funded and scaled. As with any new idea, with popularity comes misinterpretation:

Tale 1: Lean means cheap. Lean startups try to spend as little money as possible
The reality is the Lean Startup method is not about cost, it is about speed. Lean startups waste less money, because they use a disciplined approach to testing new products and ideas. Lean, when used in the context of lean startup, refers to a process of building companies and products based on lean manufacturing principles, but applied to innovation. That process involves rapid hypothesis testing, learning about customers, and a disciplined approach to product development.

Tale 2: The Lean Startup methodology is only for Web 2.0, Internet and consumer software companies
Actually, the Lean Startup methodology applies to all companies that face uncertainty about what customers will want. This is true regardless of industry or even scale of company: many established companies depend on their ability to create disruptive innovation. Those general managers are entrepreneurs, too. And they can benefit from increased speed and discipline.

Tale 3: Lean Startups are bootstrapped startups
There’s nothing wrong with raising venture capital. Many lean startups are ambitious and are able to deploy large amounts of capital. What differentiates them is their disciplined approach to determining when to spend money: after the fundamental elements of the business model have been empirically validated. Because lean startups focus on validating their riskiest assumptions first, they sometimes charge money for their product from day one – but not always.

Tale 4: Lean Startups are very small companies
This focus on size also obscures another truth: that many entrepreneurs live inside of much larger organizations. The proper definition of a startup is: a human institution creating a new product or service under conditions of extreme uncertainty. In other words, any organization striving to create disruptive innovation is a startup, whether they know it or not. Established companies have as much to gain from lean startup techniques as the mythical “two guys in a garage”.

Tale 5: Lean Startups replace vision with data or customer feedback
Lean startups are driven by a compelling vision, and they are rigorous about testing each element of this vision against reality. They use customer development, split-testing, and in-depth analytics as vehicles for learning about how to make their vision successful. Along the way, they pivot away from the elements of the vision that are delusional and double down on the elements that show promise.
The old model of entrepreneurship was dominated by an over-emphasis on the magical powers of startup founders. Usually, the stories we hear about successful startups describe a brilliant visionary, fighting valiantly against the odds to create a new reality. As employees gradually fall under his or her spell, they execute his or her master plan, which leads, in the end, to world domination.
Anyone who has spent time around real startup successes knows this story is usually wildly untrue. Founders benefit from historical revisionism and survivor’s bias: we rarely hear the stories of the thousands of visionaries who failed utterly.
The Lean Startup moves our industry past this mythological entrepreneurship story and towards a methodology that is more scientifically grounded and accessible.
People who are truly committed to a vision of changing the world in a significant way can’t afford the luxury of staying in that cozy, comfortable place of building in stealth mode without outside feedback. If you really believe your vision needs to become a reality, you owe it to yourself to test that vision with every tool available.

Best Regards,

Kristina Kozlova

Marketing Manager

 


Altabel Group

Professional Software Development

E-mail: contact@altabel.com
www.altabel.com

