Mac N Cheese

[Photo: the finished macaroni and cheese dish]

Mac'n'Cheese is a favourite dish, but the one place that I posted my recipe is gone now. Let's see if I can recreate it.

Inspired by an episode of Bones, I make my Mac'n'Cheese with leeks and pancetta now. For a vegetarian version, use your favourite vegie bacon, sprinkled with nutmeg and cinnamon while frying.

Bring 4 quarts of water to a boil, add a big handful of your favourite sea salt, and cook 1 pound of Rustichella d'Abruzzo penne for 8 minutes [two minutes less than the minimum recommended cooking time]. Drain and set aside.

Clean two medium or one large leek(s) by cutting off the roots and the green parts, and soaking the white parts in salted cold water. Thinly slice, then sweat in 3 tablespoons sweet butter with freshly ground rainbow peppercorns until translucent. Salt to taste. Alternately, sweat in the pancetta grease or in the fat in which you sautéed the vegie bacon.

Slowly add three flat tablespoons of flour and stir for two or three minutes to make a roux.

Slowly pour in three cups of milk to make a béchamel-like sauce. Cube and then stir in one-half pound of asiago [raw milk, if you can find it], one-half pound of fontinal [the Italian Fontinal, not the Danish Fontina] and one-half pound of monterey jack cheeses, until melted. Add one cup of heavy cream. Other variations may use a bit of mustard powder or seeds, Sierra Nevada mustard with stout, a few drops of Worcestershire sauce [remember it has anchovies], pesto, or any of a number of tapenades.

Add the cooked pasta to the cheese sauce and pour into a buttered glass lasagna or casserole dish.

Grate one-quarter pound of good quality parmigiano reggiano, and mix with one-half cup of fresh bread crumbs and the sautéed pancetta or vegie bacon. For the bread crumbs, I often make tiny cubes of whatever left-over bread I have around, soak in milk, squeeze nearly dry, and then add the cheese and savory. Sprinkle over the mac'n'cheese and dot with more sweet butter.

Bake at ~350°F for 30 minutes or more, until the sauce is bubbling up around the edges and the topping is lightly browned.

And remember, recipes are guidelines, not rules. Experiment. Try different cheeses, sharper, milder, mixed. Add other stuff. Make the dish yours.

Comment to BBBT Blog on WhereScape

Today started for me with a great Boulder BI Brain Trust [BBBT] session featuring WhereScape and their launch of WhereScape 3D [registration or account required to download], their new data warehouse planning tool. Beyond my interest in all things related to data management and analysis [DMA], the WhereScape 3D tool is particularly interesting to me for its potential use in Agile environments and its flexibility in being used with data integration tools other than WhereScape Red. Richard Hackathorn, who has already downloaded and used the tool, does a great job describing WhereScape 3D, which launched in beta at the BBBT, complete with cake for those in the room. [I'm awaiting the promised cross-platform JAR to try it out on my MacBook Pro.]

Unfortunately, Twitter search is letting me down today; I normally gather all the #BBBT tweets from a session, send them to Evernote, and check these "notes" as I write a blog post.

WhereScape 3D is a planning tool, allowing a data warehouse developer to profile source systems, model the data warehouse or data mart, and automagically create metadata-driven documentation. Further, one can iterate through this process, creating new versions of the models and documentation without destroying the old. The documentation can be exported as HTML and included in any web-based collaboration platform. So, there is the potential of using the documentation against Scrum-style burn-down lists and for lightweight Agile artifacts.

WhereScape 3D and Red come with a variety of ODBC drivers, and, with the proper Teradata licensing, the Teradata JDBC driver as well. One can also add other ODBC and JDBC drivers. However, neither WhereScape product currently allows connections to non-relational data sources. I find this severely limiting: in traditional enterprises, we've never worked on a DMA project that didn't include legacy systems requiring us to pull from flat files, from systems written in Pick BASIC against UniVerse or another multi-value database management system [MVDBMS], from electronic data interchange [EDI] files, from XML, or from Java or RESTful services. In other cases, we're facing new data science challenges of extreme volumetric flows of data from web, sensor and transaction logs, requiring real-time analytics, such as can be had with SQLstream, or stored in NoSQL data sources, such as Hadoop and its offshoots.

Which leads us to another interesting feature of WhereScape 3D: it's designed to be used with any data integration tool, not just WhereScape Red. I'm looking forward to getting that JAR file, currently hiding in an MS Windows EXE file, and trying WhereScape 3D in conjunction with Pentaho Data Integration [PDI or KETTLE], to see how the nimble nature of WhereScape 3D planning works with PDI Spoon AgileBI against all sorts of data flows targeting the LucidDB ADBMS and a data vault. Yeehah!

Full360 on BBBT

Today, Friday the 13th of May, 2011, the Boulder BI Brain Trust heard from Larry Hill [find @lkhill1 on Twitter] and Rohit Amarnath [find @ramarnat on Twitter] of Full360 [find @full360 on Twitter] about the company's elasticBI™ offering.

Serving up business intelligence in the Cloud has gone through the general hype cycles of all other software applications, from early application service providers (ASP), through the software-as-a-service (SaaS) pitches, to the current Cloud hype, including infrastructure and platform as a service (IaaS and PaaS). All the early efforts failed. To my mind, there have been three reasons for these failures.

  1. Security concerns on the part of customers
  2. Logistics difficulties in bringing large amounts of data into the cloud
  3. Operational problems in scaling single-tenant instances of the BI stack to a large number of customers

Full360, a 15-year-old system integrator & consultancy with a clientele ranging from startups to the top ten global financial institutions, has come up with a compelling Cloud BI story in elasticBI™, using a combination of open source and proprietary software to build a full BI stack, from ETL [Talend Open Studio, as available through Jaspersoft], to the data mart/warehouse [Vertica], to BI reporting, dashboards and data mining [Jaspersoft partnered with Revolution Analytics], all available through Amazon Web Services (AWS). Full360 is building upon their success as Jaspersoft's primary cloud partner, and upon their involvement in the RightScale Cloud Management stack, a 2010 winner of the SIIA CODiE award, with essentially the same stack as elasticBI.

Full360 has an excellent price point for medium-size businesses, or for departments within larger organizations. Initial deployment, covering set-up, engineering time and the first month's subscription, comes to less than a proof of concept might cost for a single piece of their stack. The entry-level monthly subscription, extended out for one year, is far less than an annual subscription or licensing cost for similar software, once you factor in depreciation on the hardware and the cost of personnel to maintain the system. Especially considering that the monthly fee includes operations management and a small amount of consulting time, this is a great deal for medium-size businesses.

The stack being offered is full-featured. Jaspersoft has, arguably, the best open source reporting tool available. Talend Open Studio is a very competitive data integration tool, with options for master data management, data quality and even an enterprise service bus, for complete data integration from internal and external data sources and web services. Vertica, recently purchased by HP, is a very robust and high-performance column-store Analytic Database Management System (ADBMS) with "big data" capabilities.

All of this is wonderful, but none of it is really new, nor a differentiator from the failed BI services of the past, nor from the on-going competition today. Where Full360 may win, however, is in how they answer the three challenges that caused the failure of those past efforts.

Security

Full360's elasticBI™ handles the security question with the answer that they're using AWS security. More importantly, they recognize the security concerns: one of their presentation sections today, "Hurdles for Cloud BI", named cloud security, data security and application security, all three of which are handled by AWS standard security practices. Whether or not this is sufficient, especially in the eyes of customers, is uncertain.

Operations

Operations and maintenance is one area where Full360 is taking great advantage of the evolution of current Cloud services best known methods and "devops", by using Opscode Chef recipes to handle deployment, maintenance, ELT and upgrades. However, whether or not this level of automation will be sufficient to counter the lack of a multi-tenant architecture remains to be seen. There are those who argue that the true Cloud, or even the older SaaS, differentiators, and the ability to scale profitably at these price points, depend on multi-tenancy, which keeps all customers on the same version of the stack. The heart of providing multi-tenancy is in the database, and this is the point where most SaaS vendors, other than salesforce-dot-com (SFDC), fail. However, Jaspersoft does claim support for a multi-tenant architecture. It may be that Full360 will be able to maintain the balance between security/privacy and scalability with their use of devops, and without creating a new multi-tenant architecture.

Also, the point of Cloud services isn't the cloud at all. That is, the fact that the hardware, software, platform, what-have-you is in a remote or distributed data center isn't the point. The point is elastic self-provisioning: the ability of customers to add resources on their own, and be charged accordingly.

Data Volume

The entry-level data volume for elasticBI™ is the size of a departmental data mart today. But even today, successfully loading that much data into the Cloud in a nightly ETL run simply isn't feasible. Full360 is leveraging Aspera's technology for high-speed data transfer, and AWS does support a form of good ol' fashioned "sneaker net", allowing customers to mail in hard drives. In addition, current customers with larger data volumes are drawing that data from the cloud, with the source being in AWS already, or in SFDC. This is a problem that will continue to be an "arms race" into the future, with data volumes, source location and bandwidth in a three-way pile-up.

In conclusion, Full360 has developed an excellent BI service to supplement their professional services offerings. Larger organizations are still wary of allowing their data out of their control, or may be afraid of the target web services present for hackers, as exemplified by the recent break-ins at the banks' & retailers' email scammers, er, marketing providers, and at Sony. Smaller companies, which might find the price attractive enough to offset security concerns, haven't seen the need for BI. So, the question remains as to whether or not the market is interested in BI in the Cloud.

This post was simultaneously published on the Blog of the Boulder BI Brain Trust, of which I'm a member.

Setting up the Server for OSS DSS

The first thing to do when setting up your server with open source solutions [OSS] for a decision support system [DSS] is to check all the dependencies and system requirements for the software that you're installing.

Generally, in our case, once you make sure that your software will work on the version of the operating system that you're running, the major dependency is Java. Some of the software that we're running may have trouble with OpenJDK, and other software may require the Java software development kit [JDK or Java SDK], not just the runtime environment [JRE]. For example, Hadoop 0.20.2 may have problems with OpenJDK, and versions before LucidDB 0.9.3 required the JDK. Once upon a time, two famous database companies would issue system patches that were required for their own RDBMS to run, but would break the other's, forcing customers to have only one system per host. A true pain for development environments.
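
For example, here's a quick check of which Java your server provides, and whether it's a full JDK rather than just a JRE [a minimal sketch; the version strings will vary by platform and vendor]:

java -version     # reports the runtime version and vendor
javac -version    # succeeds only if a full JDK is installed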

Since I don't know when you'll be reading this, or if you're planning to use different software than I'm using, I'm just going to suggest that you check very carefully that the system requirements and software dependencies are fulfilled by your server.

Now that we're sure that the *Nix or Microsoft operating system that we're using will support the software that we're using, the next step is to set up a system user for each software package. Here are examples for two *Nix operating systems: a Linux kernel 2.x derivative and the BSD-derived MacOSX. I've tested this on Red Hat Enterprise Linux 5, OpenSUSE 11, and MacOSX 10.5 [Leopard] and 10.6 [Snow Leopard].

On Linux, at the command line interface [CLI]:

useradd -c "name your software Server" -s /bin/bash -mr USERNAME

-c COMMENT is the comment field used as the user's full name
-s SHELL defines the login shell
-m creates the home directory
-r creates the account as a system user

Likely, you will need to run this command through sudo, and may need the full path:

/usr/sbin/useradd

Change the password:

sudo passwd USERNAME

Here's one example, setting up the Pentaho system user.

poc@elf:~> sudo /usr/sbin/useradd -c "Pentaho BI Server" -s /bin/bash -mr pentaho
poc@elf:~> sudo passwd pentaho
root's password:
Changing password for pentaho.
New Password:
Reenter New Password:
Password changed.
poc@elf:~>

On the Mac, use dscl; note that each -create call sets a single attribute:

vate:~ poc$ sudo dscl /Local/Default -create /Users/_pentaho
vate:~ poc$ sudo dscl /Local/Default -create /Users/_pentaho RealName "PentahoCE BI Server"
vate:~ poc$ sudo dscl /Local/Default -create /Users/_pentaho UserShell /bin/bash
vate:~ poc$ sudo passwd _pentaho
Changing password for _pentaho.
New Password:
Reenter New Password:
Password changed.
vate:~ poc$

On Windows, you'll want to set up your server software as a service after the installation.
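
As a minimal sketch, assuming the package ships a Windows service wrapper executable [the service name and path here are hypothetical], you could register it at an Administrator command prompt with the built-in sc tool:

rem register and start the server as a Windows service (names and path are hypothetical)
sc create PentahoBIServer binPath= "C:\pentaho\biserver-ce\pentaho-service.exe" start= auto DisplayName= "Pentaho BI Server"
sc start PentahoBIServer

Note that sc requires the space after each option's equals sign, and binPath must point to a real service executable, not a plain batch file; many packages ship an installer or wrapper that handles this for you.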

If you haven't already done so, you'll want to download the software that you want to use from the appropriate place. In many cases this will be SourceForge. Alternate sources might be the Enterprise Editions of Pentaho, the DynamoBI downloads for LucidDB, SQLstream, SpagoWorld, The R Project, Hadoop, and many more possibilities.

Installing this software is no different than installing any other software on your particular operating system [a sample Linux install follows the list]:

  • On any system, you may need to unpack an archive, indicated by a .zip, .rar, .gz or .tar file extension. On Windows & MacOSX, you will likely just double-click the archive file to unpack it. On *Nix systems, including MacOSX and Linux, you may also use the CLI and a command such as gunzip, unzip, or tar xvzf.
  • On Windows, you'll likely double-click a .exe file and follow the instructions from the installer.
  • On MacOSX, you might double-click a .dmg file and drag the application into the Applications directory, or you'll do something more *Nix-like.
  • On Linux systems, you might, at the CLI, execute the .bin file as the system user that you set up for this software.
  • On *Nix systems, you may wish to install the server-side somewhere other than a user-specific or local Applications directory, such as /usr/local/ or even in a web-root.
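
For example, here's a minimal sketch of a server-side install on Linux, placing the Pentaho BI Server CE under /usr/local and handing it over to the pentaho system user created above [the archive name and unpacked directory are hypothetical; use whatever version you downloaded]:

cd /usr/local
sudo tar xvzf ~/Downloads/biserver-ce-3.8.0.tar.gz      # unpack the archive in place
sudo chown -R pentaho:pentaho /usr/local/biserver-ce    # give the system user ownership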

One thing to note is that most of the software that you'll use for an OSS DSS uses Java, and that the latest Pentaho includes the latest Java distribution. Most other software doesn't. Depending on your platform, and the supporting software that you have installed, you may wish to point [softwareNAME]_JAVA_HOME to the Pentaho Java installation, especially if the version of Java included with Pentaho meets the system requirements for other software that you want to use, and you don't have any other compatible Java on your system.
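
As a sketch, you might point other software at the Java that ships with Pentaho by way of the system user's shell profile [the paths here are hypothetical; check where your Pentaho version actually puts its bundled JRE]:

export PENTAHO_JAVA_HOME=/usr/local/biserver-ce/java    # hypothetical location of the bundled Java
export JAVA_HOME=$PENTAHO_JAVA_HOME                     # only if no other compatible Java is on the system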

For both security and to avoid any confusion, you might want to change the ports used by the software you installed from their defaults.
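
For example, the Pentaho BI Server CE runs on Apache Tomcat, whose listening port is set in tomcat/conf/server.xml under the install directory. A minimal sketch with GNU sed, assuming the layout from the install example above [back up the file first; the default port may differ in your version]:

sudo sed -i 's/port="8080"/port="18080"/' /usr/local/biserver-ce/tomcat/conf/server.xml   # move Tomcat off the default 8080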

You may need to change other configuration files from their defaults for various reasons as well, though I generally find the defaults to be satisfactory. You may also need to install pieces of one package into another package, for compatibility or interchange. For example, if you're trying out, or have purchased, Pentaho Enterprise Edition with Hadoop, Pentaho provides Java libraries [JAR files] and licenses to install on each Hadoop node, including code that Pentaho has contributed to the Hadoop project.

Also remember that Hadoop is a top-level Apache project, and not usable software in and of itself. It contains subprojects that make it useful:

  • Hadoop Common - the utilities that support all the rest
  • HDFS - the Hadoop Distributed File System
  • MapReduce - the software framework for distributed processing of data on clusters

You may also want one or more of the other Apache subprojects related to Hadoop [a quick smoke test of a basic installation follows the list]:

  • Avro - a data serialization system
  • Chukwa - a data collection system
  • HBase - a distributed database management system for structured data
  • Hive - a data warehouse infrastructure
  • Mahout - a data mining library
  • Pig - a high-level data processing language for parallelization
  • ZooKeeper - a coordination service for distributed applications
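
Once the core subprojects are in place, a minimal smoke test from the Hadoop install directory might look like this [a sketch for the 0.20.x tarball layout, run as the system user you created for Hadoop; the HDFS paths are hypothetical]:

bin/hadoop version                              # confirm the build you installed
bin/hadoop fs -mkdir /user/pentaho              # create a working directory in HDFS
bin/hadoop fs -put sample.txt /user/pentaho/    # copy a local file into HDFS
bin/hadoop fs -ls /user/pentaho                 # list it back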

Welcome to the Data Archon

We've been blogging for over five years on open source solutions for data management & analytics, on collaboration, mobile, project management, Agile implementations and the like. In addition to these topics, we'll be discussing statistical, mathematical and computerized modeling, management and analysis of data.

