Artificial Intelligence prepping data – Back to basic part 1

What did street-legal cars teach me about machine learning? It's all about having the right data available, and the RDW data can't be trusted!

I impulsively signed up for an Artificial Intelligence certification track 2 months ago, so I've been experimenting with Artificial Intelligence for a while now. In the beginning the course was one tough cookie! Those formulas for interpreting how predictable the data is really freaked me out!

But once I got past the formulas and saw how much the workspace resembles Microsoft products like SSIS and BI, I saw endless possibilities. This takes the data to a whole new level.

Preparing the data:

I did a test on all cars that are currently on the road in the Netherlands and combined it with performance data. I wanted to find the fastest street legal car. I guess I just wanted to find out what kind of cars I should fancy these days according to the performance stats.

I used an open data set from the Dutch RDW (the Netherlands Vehicle Authority). It contained 14 million rows and is 7 GB in size, so I had to prepare the data to keep the experiment basic and the performance high. I imported it into my SQL Server and filtered out the station wagons, campers, scooters and trailers, which left me with a data set of 900,000 rows.
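
A filter like this can also be applied while streaming the export, before it ever reaches the database. Below is a minimal sketch of that idea, not the query I actually ran. It assumes a CSV export with a hypothetical `body_type` column and made-up category labels; the real RDW column names are different.

```python
import csv
import io

# Categories dropped in the post; the exact label spellings are assumptions.
EXCLUDED_TYPES = {"stationwagon", "camper", "scooter", "trailer"}


def filter_vehicles(lines, type_column="body_type"):
    """Yield rows whose vehicle type is not in EXCLUDED_TYPES.

    `lines` is any iterable of CSV lines (an open file works), so a large
    export is processed row by row instead of being loaded into memory.
    """
    for row in csv.DictReader(lines):
        if row[type_column].strip().lower() not in EXCLUDED_TYPES:
            yield row


# Tiny made-up sample standing in for the real export.
sample = io.StringIO(
    "brand,body_type\n"
    "Koenigsegg,coupe\n"
    "Volvo,stationwagon\n"
    "Hobby,camper\n"
    "Porsche,coupe\n"
)
kept = list(filter_vehicles(sample))
print([row["brand"] for row in kept])
```

Because the function is a generator, the same code would work on an open file handle for the full export.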

I used SQL Server 2017 and Microsoft Azure Machine Learning Studio to create a new experiment.

In order to make a prediction I needed to combine the brand data with the engine displacement data, because horsepower data was not available, to see which models are high performance based on engine capacity. Sadly, this means the smaller supercharged engines are not correctly represented in the prediction.

The calculation based on the rules above took a local SQL Server on an i5 laptop about 15 minutes. I needed more data preparation.

Based on engine displacement, a top 3 came up. But I didn't like the results at all. Sure, the engine displacement was high, but the cars are heavy and their performance isn't the best. Superchargers, turbos and gearing make all the difference, but aren't properly represented in this result.

I had to filter out a lot of data. Next I added the weight of the car, but it wasn't trustworthy either. I found a data set which contained the kW and top speed of the cars, joined it with my current results, and added a calculation in SQL that multiplies the kW column by 1.362 to get the hp of the car. The hp outcome looks pretty accurate. After 4 hours of combining data and filtering the queries I gave up. Based on this data there is no way you can truly point out the fastest cars. I had to change my plans: too many uncertain variables to make a decent prediction, and not even close to the start of an AI project 🙁
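
The kW-to-hp step is simple enough to sketch outside SQL as well. This is only an illustration of the multiplication described above, using the same factor of 1.362; the function name and the NULL handling are my own assumptions, added because the data contains a lot of missing values.

```python
# Factor used in the post. The exact value for metric horsepower is about
# 1.3596 hp per kW, so 1.362 slightly overestimates.
KW_TO_HP = 1.362


def kw_to_hp(kw):
    """Convert kilowatts to horsepower; None (a SQL NULL) stays None."""
    if kw is None:
        return None
    return round(kw * KW_TO_HP, 1)


for kw in (100, 250, None):
    print(kw, "kW ->", kw_to_hp(kw), "hp")
```

Passing NULLs through unchanged keeps rows without power data visible instead of silently turning them into zeros.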

Lots of NULL data
This No. 1 car can't be trusted!

After more data crunching, the results are still not really worth displaying. So here is a top 21 of "fastest" cars, based on... well, the obvious: sorting by hp and weight:

By the way, did you know there is only one Koenigsegg on Dutch roads?

Ok, I got a little bit carried away with data prepping.

Now let's import it into an AI experiment. First you need to create a resource in the Azure Portal for your workspace. I won't go into details, we did this before!

Verify that you created the following new resources: a Machine Learning Workspace, a Machine Learning Plan and a Storage Account.

Browse to the Machine learning workspace you created and launch Machine Learning Studio. This opens a new browser page.

Go to Experiments and click NEW in the bottom-left corner.

Rename the experiment and upload a new dataset: Datasets --> NEW --> Select data to upload. Now that you have the dataset ready, you can drag it into your experiment and start running tests and trying out variables on the data.

In my next post we will dive deeper into Artificial Intelligence.


PaaS taking over the world! Are DBAs a dying breed?

PaaS vs IaaS

 

It's Easter weekend, or as the Dutch say, 'PaaS weekend'. So it might be a good idea to talk a bit about PaaS. What is it, and what does it mean for you as a database administrator? Are DBAs a dying breed? Will they just shift over to more complex or broader tasks, or are they here to stay? It all depends on the company and the software it is running.

What is #IaaS? In short, it's a VM: your database is running in a data center. As a DBA you have the same job requirements as when running a database on-premises: updates, backups, patches, tuning, security and account control.

But #PaaS is a different story: it runs the database as a service, so there is no need for a DBA. At least that's what they tell you. But does PaaS solve all your performance and tuning needs? Is that faulty query suddenly solved when it is moved to PaaS? Nope. Machine learning is doing a great job so far, but it isn't the magical quick fix, yet.

In all honesty, most companies don't care about tuning a database. Not all applications have complex queries and tasks running on their SQL Server, and most are fine running an Express edition. They don't even bother having a DBA. The database is taken care of by a system engineer, if it is looked at at all.

Where does this put you as a DBA? Don't sob, we still need you! The market is still flooded with old-school, high-maintenance MS SQL-driven apps, and it would be a big relief if these could be taken care of with PaaS. The only good thing coming out of these high-maintenance, splintered databases is the data itself. Do you really want to spend your time updating, patching, granting rights to users and saying no to SA account requests? No, you don't!

Besides, in the real world, companies don't evolve as fast as the IT world itself. Most applications, outdated or not, are not being replaced overnight, and not all vendors are quite ready for PaaS environments with their applications. So you just have to decide which side you want to be on: fast IT or slow IT. There is still a big playground available for both in the coming years.


Creating a linked server ´MySQL to MSSQL´(query the MySQL database without openquery function)

In addition to my previous linked server tutorials, I decided it was time to add MySQL to the linked server series.
To migrate the bug tracking application Mantis from Linux to a Windows environment, I wanted to create a replication between SQL 2008 and MySQL. But then I thought, why not try out a virtual linked server again first to test the Mantis installation on Windows, since what you read online about Mantis on an MSSQL environment is not very promising. So today we will create a linked server from MySQL to MSSQL on a Windows 2008 R2 64-bit environment.

Create DSN for MySQL

To do so, we first need to install the correct drivers in order to create an ODBC DSN. Just download the drivers from the MySQL developers site, http://dev.mysql.com/downloads/connector/odbc/, and install them on your database server. If we see the drivers listed, it means we can create a new DSN. So open up the System DSN tab and add a new DSN. You must fill in the correct credentials, for example:

Data Source Name: enter a descriptive name, so you can see what it does; you might have more linked servers or other connectors running on the same server.
Description: this isn't mandatory, but if you want to be more specific, be my guest.
Server: in my case it's localhost, as this is a test server and MSSQL and MySQL are on the same server.
Insert the username and password. When this is done, the Database dropdown will display the databases you can connect to.

Click OK and as you can see the System DSN has been added to ODBC.

Create new Linked Server

When this is done, it's time to open up the MSSQL server and add a new linked server under Server Objects. Name your linked server; I give it the same name as the System DSN. Choose the correct provider, Microsoft OLE DB Provider for ODBC Drivers, and set the data source to the DSN name.
You need to fill in all the credentials for the provider string, for example:

DRIVER={MySQL ODBC 5.2 ANSI Driver};SERVER=localhost;PORT=3306;DATABASE=mantisbt;USER=user;PASSWORD=password;OPTION=3;

Note: meaning of OPTION=3 in the MySQL connection string:
Option=1 FLAG_FIELD_LENGTH: do not optimize column width
Option=2 FLAG_FOUND_ROWS: return matching rows
Option=3: options 1 and 2 together
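
The option values are bit flags, which is why 1 and 2 add up to 3. Here is a small sketch of how the provider string above could be assembled, using only the two flags from the note. The helper function `build_provider_string` is hypothetical and not part of any driver; the credentials are the placeholder values from the example.

```python
# Flag values as listed in the note above.
FLAG_FIELD_LENGTH = 1  # do not optimize column width
FLAG_FOUND_ROWS = 2    # return matching rows


def build_provider_string(driver, server, port, database, user, password, option):
    """Assemble a semicolon-separated ODBC provider string."""
    parts = [
        ("DRIVER", "{" + driver + "}"),  # ODBC driver names go in braces
        ("SERVER", server),
        ("PORT", port),
        ("DATABASE", database),
        ("USER", user),
        ("PASSWORD", password),
        ("OPTION", option),
    ]
    return "".join(f"{key}={value};" for key, value in parts)


# Combine both flags with a bitwise OR: 1 | 2 == 3.
option = FLAG_FIELD_LENGTH | FLAG_FOUND_ROWS
provider_string = build_provider_string(
    "MySQL ODBC 5.2 ANSI Driver", "localhost", 3306,
    "mantisbt", "user", "password", option,
)
print(provider_string)
```

Building the string from named flags makes it easier to see which driver behaviour is switched on than a bare OPTION=3.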

Now click OK. This is always the most fun part for me, when it says the connection to the linked server succeeded!

In addition to this, you can enable provider options on the SQLOLEDB. In my case I selected Dynamic Parameter and Allow inprocess.

Now, let's run the test and see if it connects to the databases. As you can see, it connects to all the databases available on the MySQL server.

Connection test

But, most important, we can query it directly. Wheeee!

Linked server without OpenQuery function (Tip!)

Maybe you have read other MySQL linked server tutorials before this one and found out that you could only query the MySQL database using the openquery() function, or maybe that IS the reason you ended up on this site. Extra, as in extra work, is never fun! With the correct ODBC driver and the right provider options, you can query the MySQL database just like any other MSSQL database on your MSSQL server. Just follow the tutorial above and don't forget to enable the correct provider options. Cheers!


Total cloud control together with Oracle and Microsoft

Total cloud control - "Reduce downtime" and "pluggable database"

 

This is what the new Oracle RDBMS 12c is all about: it's a business-driven enterprise cloud management solution.

Two features highlighted:

Pluggable database (PDB) features: one instance for multiple databases saves lots of memory, and you can upgrade multiple databases at once.

Multitenancy* (the key ingredient for the cloud): the advantage of this shared server technology is its efficiency, with the guarantee of data isolation and discrete tenant performance management. DB admins can discretely manage service levels and define resource allocation and priority.

Oracle also promises that 12c reduces database disk I/O and uses memory fairly, which was not really the case with the resource-hungry and currently popular Oracle 11g, which only works well on a powerful x64 machine.

*Multitenancy refers to the already growing SaaS (software as a service) community, where many customers share the same application instance but with separated data. Oracle pushes this approach from the application down to the database!

So what's the fuss about?

Oracle and Microsoft announced their partnership today, alongside partnerships with some other (SaaS) software vendors, that will, I quote, "Reshape the cloud and reshape the perception of Oracle technology in the cloud".

Partnerships are no big news in the ever-changing IT world, but with Microsoft and Oracle having been database competitors for a long time, this is big news. This could bring together the best of both worlds. But will it change the way we think about and work with databases? Will Oracle's 12c become the big force behind the SQL Azure cloud services? I can't wait to hear more about this!

I'm curious to hear your thoughts on this partnership.


Tool tip for this month: SSMA for MySQL

Since Sun Microsystems was acquired by Oracle, some people have had their doubts about the future of MySQL. Of course there is MariaDB, a fairly new MySQL fork founded by the mastermind behind MySQL, Monty. But it just doesn't do it 'yet' for the bigger audience. I have never heard a client ask for a MariaDB database solution. The questions are always the same: are we going to use Oracle or Microsoft, and what are the costs for the organization to implement and support it?

SQL 2008 Express

If you are a personal or non-heavy user of databases, Microsoft has released the MSSQL Express edition, a free SQL database version that supports databases up to 10 GB. Read more about the free Microsoft SQL 2008 R2 Express here.

And to make it even easier for you, Microsoft released the SQL Server Migration Assistant (SSMA) for MySQL on the 12th of August. The SSMA tool is available for SQL 2005, 2008, 2008 R2 and SQL Azure, is compatible with MySQL version 4.2 and higher, and is also suitable for 64-bit platforms. So what's keeping you?

For everyone who still has doubts, Microsoft will now convince you. After SQL migration assistants for Oracle, Access and Sybase, MySQL has been added to the list. It is a pretty smart move by Microsoft to win over even more new clients: for those who still had cold feet thinking about costs and server downtime, there is no longer an excuse. Microsoft is listening, improving and on the move!