Summary
Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. You'll use the flexible Python programming language to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.
About the Book
A machine is said to learn when its performance improves with experience. Learning requires algorithms and programs that capture data and ferret out the interesting or useful patterns. Once the specialized domain of analysts and mathematicians, machine learning is becoming a skill needed by many.
Machine Learning in Action is a clearly written tutorial for developers. It avoids academic language and takes you straight to the techniques you'll use in your day-to-day work. Many (Python) examples present the core algorithms of statistical data processing, data analysis, and data visualization in code you can reuse. You'll understand the concepts and how they fit in with tactical tasks like classification, forecasting, recommendations, and higher-level features like summarization and simplification.
Readers need no prior experience with machine learning or statistical processing. Familiarity with Python is helpful.
What
Like the popular second edition, Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining, including both tried-and-true techniques of today and methods at the leading edge of contemporary research.
Complementing the book is Weka, a fully functional, platform-independent, open source software package for machine learning, available for free download.
The book is a major revision of the second edition that appeared in 2005. While the basic core remains the same, it has been updated to reflect the changes that have taken place over the last four or five years. The highlights of the updated edition include completely revised technique sections; new chapters on Data Transformations, Ensemble Learning, and Massive Data Sets; a new "book release" version of the popular Weka machine learning open source software (developed by the authors and specific to the Third Edition); new material on "multi-instance learning"; new information on ranking the classification; plus comprehensive updates and moderniza
Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
Want to tap the tremendous amount of valuable social data in Facebook, Twitter, LinkedIn, and Google+? This refreshed edition helps you discover who’s making connections with social media, what they’re talking about, and where they’re located. You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.
Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.
- Get a straightforward synopsis of the social web landscape
- Use adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+
- Learn how to employ easy-to-use Python tools to slice and dice the data you collect
- Explore social connections in microformats with the XHTML Friends Network
- Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection
- Build interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits
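Two of the techniques named in the list above, TF-IDF weighting and cosine similarity, can be illustrated with a minimal sketch in plain Python. This is not taken from the book: the function names `tf_idf_vectors` and `cosine_similarity` and the sample token lists are assumptions made for this example, and the weighting used is the simple tf × log(N/df) form rather than any particular library's variant.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Hypothetical helper: weight = (term count / doc length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Made-up sample data: three tiny, already tokenized "documents".
docs = [
    ["hadoop", "cluster", "hdfs"],
    ["hadoop", "mapreduce", "hdfs"],
    ["sql", "joins", "indexes"],
]
vecs = tf_idf_vectors(docs)
print(cosine_similarity(vecs[0], vecs[1]))  # shared terms, so greater than 0
print(cosine_similarity(vecs[0], vecs[2]))  # no shared terms
```

Documents that share weighted terms score above zero, while documents with no terms in common score exactly zero. A term that appears in every document gets weight zero under this weighting, which is why the idf factor is useful for ignoring ubiquitous words.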
"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data."
--Alex Martelli, Senior Staff Engineer, Google
How can you bring out MySQL’s full power? With High Performance MySQL, you’ll learn advanced techniques for everything from designing schemas, indexes, and queries to tuning your MySQL server, operating system, and hardware to their fullest potential. This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover.
Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL.
- Learn the effects of new features in MySQL 5.5, including stored procedures, partitioned databases, triggers, and views
- Implement improvements in replication, high availability, and clustering
- Achieve high performance when running MySQL in the cloud
- Optimize advanced querying features, such as full-text searches
- Take advantage of modern multi-core CPUs and solid-state disks
- Explore backup and recovery strategies—including new tools for hot online backups
Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters.
This revised edition covers recent changes to Hadoop, including new features such as Hive, Sqoop, and Avro. It also provides illuminating case studies that illustrate how Hadoop is used to solve specific problems. Looking to get the most out of your data? This is your book.
- Use the Hadoop Distributed File System (HDFS) for storing large datasets, then run distributed computations over those datasets with MapReduce
- Become familiar with Hadoop’s data and I/O building blocks for compression, data integrity, serialization, and persistence
- Discover common pitfalls and advanced features for writing real-world MapReduce programs
- Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
- Use Pig, a high-level query language for large-scale data processing
- Analyze datasets with Hive, Hadoop’s data warehousing system
- Take advantage of HBase, Hadoop’s database for structured and semi-structured data
- Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems
"Now you have the opportunity to learn about Hadoop from a master -- not only of the technology, but also of common sense and plain talk."
--Doug Cutting, Cloudera
An international sensation (and still the talk of the relevant blogosphere), this Wall Street Journal and New York Times business bestseller examines the “power” in numbers. Today more than ever, number crunching affects your life in ways you might not even imagine. Intuition and experience are no longer enough to make the grade. In order to succeed, or even survive, in our data-based world, you need to become statistically literate.
Cutting-edge organizations are already crunching increasingly large databases to find the unseen connections among seemingly unconnected things and to predict human behavior with staggeringly accurate results. From Internet sites like Google and Amazon that use filters to keep track of your tastes and your purchasing history, to insurance companies and government agencies that every day make decisions affecting your life, the brave new world of the super crunchers is happening right now. No one who wants to stay ahead of the curve should make another keystroke without reading Ian Ayres’s engrossing and enlightening book.
Ready to unlock the power of your data? With this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
You’ll find illuminating case studies that demonstrate how Hadoop is used to solve specific problems. This third edition covers recent changes to Hadoop, including material on the new MapReduce API, as well as MapReduce 2 and its more flexible execution model (YARN).
- Store large datasets with the Hadoop Distributed File System (HDFS)
- Run distributed computations with MapReduce
- Use Hadoop’s data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence
- Discover common pitfalls and advanced features for writing real-world MapReduce programs
- Design, build, and administer a dedicated Hadoop cluster—or run Hadoop in the cloud
- Load data from relational databases into HDFS, using Sqoop
- Perform large-scale data processing with the Pig query language
- Analyze datasets with Hive, Hadoop’s data warehousing system
- Take advantage of HBase for structured and semi-structured data, and ZooKeeper for building distributed systems
If you're looking for a scalable storage solution to accommodate a virtually endless amount of data, this book shows you how Apache HBase can fulfill your needs. As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away.
- Discover how tight integration with Hadoop makes scalability with HBase easier
- Distribute large datasets across an inexpensive cluster of commodity servers
- Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs
- Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more
- Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs
- Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks
Implement a Robust BI Solution with Microsoft SQL Server 2012
Equip your organization for informed, timely decision making using the expert tips and best practices in this practical guide. Delivering Business Intelligence with Microsoft SQL Server 2012, Third Edition explains how to effectively develop, customize, and distribute meaningful information to users enterprise-wide. Learn how to build data marts and create BI Semantic Models, work with the MDX and DAX languages, and share insights using Microsoft client tools. Data mining and forecasting are also covered in this comprehensive resource.
- Understand the goals and components of successful BI
- Design, deploy, and manage data marts and OLAP cubes
- Load and cleanse data with SQL Server Integration Services
- Manipulate and analyze data using MDX and DAX scripts and queries
- Work with SQL Server Analysis Services and the BI Semantic Model
- Author interactive reports using SQL Server Data Tools
- Create KPIs and digital dashboards
- Use data mining to identify patterns, correlations, and clusters
- Implement time-based analytics
- Embed BI reports in custom applications using ADOMD.NET
These days it seems like everyone is collecting data. But all of that data is just raw information -- to make that information meaningful, it has to be organized, filtered, and analyzed. Anyone can apply data analysis tools and get results, but without the right approach those results may be useless.
Author Philipp Janert teaches you how to think about data: how to effectively approach data analysis problems, and how to extract all of the available information from your data. Janert covers univariate data, data in multiple dimensions, time series data, graphical techniques, data mining, machine learning, and many other topics. He also reveals how seat-of-the-pants knowledge can lead you to the best approach right from the start, and how to assess results to determine if they're meaningful.
SQL (Structured Query Language) is a standard programming language for generating, manipulating, and retrieving information from a relational database. If you're working with a relational database--whether you're writing applications, performing administrative tasks, or generating reports--you need to know how to interact with your data. Even if you are using a tool that generates SQL for you, such as a reporting tool, there may still be cases where you need to bypass the automatic generation feature and write your own SQL statements.
To help you attain this fundamental SQL knowledge, look to Learning SQL, an introductory guide to SQL, designed primarily for developers just cutting their teeth on the language.
Learning SQL moves you quickly through the basics and then on to some of the more commonly used advanced features. Among the topics discussed:
- The history of the computerized database
- SQL Data Statements--those used to create, manipulate, and retrieve data stored in your database; example statements include select, update, insert, and delete
- SQL Schema Statements--those used to create database objects, such as tables, indexes, and constraints
- How data sets can interact with queries
- The importance of subqueries
- Data conversion and manipulation via SQL's built-in functions
- How condi
Is your data dragging you down? Are your tables all tangled up? Well we've got the tools to teach you just how to wrangle your databases into submission. Using the latest research in neurobiology, cognitive science, and learning theory to craft a multi-sensory SQL learning experience, Head First SQL has a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.
Maybe you've written some simple SQL queries to interact with databases. But now you want more: you want to really dig into those databases and work with your data. Head First SQL will show you the fundamentals of SQL and how to really take advantage of it. We'll take you on a journey through the language, from basic INSERT statements and SELECT queries to hardcore database manipulation with indices, joins, and transactions. We all know "Data is Power" - but we'll show you how to have "Power over your Data". Expect to have fun, expect to learn, and expect to be querying, normalizing, and joining your data like a pro by the time you're finished reading!
Apply powerful window functions in T-SQL and increase the performance and speed of your queries
Optimize your queries—and obtain simple and elegant solutions to a variety of problems—using window functions in Transact-SQL. Led by T-SQL expert Itzik Ben-Gan, you’ll learn how to apply calculations against sets of rows in a flexible, clear, and efficient manner. Ideal whether you’re a database administrator or developer, this practical guide demonstrates ways to use more than a dozen T-SQL querying solutions to address common business tasks.
Discover how to:
- Go beyond traditional query approaches to express set calculations more efficiently
- Delve into ordered set functions such as rank, distribution, and offset
- Implement hypothetical set and inverse distribution functions in standard SQL
- Use strategies for improving sequencing, paging, filtering, and pivoting
- Increase query speed using partitioning, ordering, and coverage indexing
- Apply new optimization iterators such as Window Spool
- Handle common issues such as running totals, intervals, medians, and gaps
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
Why learn R? Because it's rapidly becoming the standard for developing statistical software. R in a Nutshell provides a quick and practical way to learn this increasingly popular open source language and environment. You'll not only learn how to program in R, but also how to find the right user-contributed R packages for statistical modeling, visualization, and bioinformatics.
The author introduces you to the R environment, including the R graphical user interface and console, and takes you through the fundamentals of the object-oriented R language. Then, through a variety of practical examples from medicine, business, and sports, you'll learn how you can use this remarkable tool to solve your own data analysis problems.
- Understand the basics of the language, including the nature of R objects
- Learn how to write R functions and build your own packages
- Work with data through visualization, statistical analysis, and other methods
- Explore the wealth of packages contributed by the R community
- Become familiar with the lattice graphics package for high-level data visualization
- Learn about bioinformatics packages provided by Bioconductor
"I am excited about this book. R in a Nutshell is a great introduction to R, as well as a comprehensive reference for using R in data analytics and visualization. A
The Definitive Guide to Microsoft SQL Server 2012 Reporting Services
Create, deploy, and manage business intelligence reports using the expert tips and best practices in this hands-on resource. Written by a member of the original Reporting Services development team, Microsoft SQL Server 2012 Reporting Services, Fourth Edition covers the complete process of building and distributing reports and explains how to maximize all of the powerful, integrated SSRS capabilities, including the new and enhanced features. A detailed case study and sample reports are included in this practical reference.
- Plan for, install, configure, and customize SQL Server 2012 Reporting Services
- Retrieve data with SELECT queries
- Generate reports from the Report Wizard and from scratch
- Enhance your reports with charts, images, gauges, and maps
- Add value to reports through summarizing, totaling, and interactivity
- Build reusable report templates
- Embed Visual Basic .NET functions and subreports into your reports
- Enable end-user access to reports via the Report Server and its Report Manager web interface
- Integrate SSRS reports with your own websites and custom applications
- Follow along with sample reports from the book's case study
Transform your skills, data, and business with the power user’s guide to PowerPivot for Excel. Led by two business intelligence (BI) experts, you’ll learn how to create and share your own BI solutions using software you already know and love: Microsoft Excel. Discover how to extend your existing skills, using the PowerPivot add-in to quickly turn mass quantities of data into meaningful information and on-the-job results, with no programming required. The book introduces you to PowerPivot functionality, then takes a pragmatic approach to understanding and working with data models, data loading, data manipulation with Data Analysis Expressions (DAX), simple-to-sophisticated calculations, what-if analysis, and PowerPivot patterns. Learn how to create your own “self-service” BI solutions, then share your results effortlessly across your organization using Microsoft SharePoint®.
A Note Regarding the CD or DVD
The print version of this book ships with a CD or DVD. For those customers purchasing one of the digital formats in which this book is available, we are pleased to offer the CD/DVD content as a free download via O'Reilly Media's Digital Distribution services. To download this content, please visit O'Reilly's web site, search for the title of this book to find its catalog page, and click on the link below the cover image (Examples, Companion Content, or Practice Files). Note that while we provide as much of the media content as we are able via free download, we are sometimes limited by licensing restrictions. Please direct any questions or concerns to booktech@oreilly.com.
The leading introductory book on data mining, fully updated and revised!
When Berry and Linoff wrote the first edition of Data Mining Techniques in the late 1990s, data mining was just starting to move out of the lab and into the office; it has since grown to become an indispensable tool of modern business. This new edition, more than 50% new and revised, is a significant update from the previous one, and shows you how to harness the newest data mining methods and techniques to solve common business problems. The duo of unparalleled authors share invaluable advice for improving response rates to direct marketing campaigns, identifying new customer segments, and estimating credit risk. In addition, they cover more advanced topics such as preparing data for analysis and creating the necessary infrastructure for data mining at your company.
- Features significant updates since the previous edition and updates you on best practices for using data mining methods and techniques for solving common business problems
- Covers a new data mining technique in every chapter along with clear, concise explanations on how to apply each technique immediately
- Touches on core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, survival analysis, and more
- Provides best practices for performing data mining using simple tools such as Excel
Data Mining Techniques, Third Edition covers a new data mining technique with each successive chapter and then demonstrates how you can apply that technique for improved marketing, sales, and customer support to get immediate results.
Master Oracle Real Application Clusters
Maintain a dynamic enterprise computing infrastructure with expert instruction from an Oracle ACE. Oracle Database 11g Oracle Real Application Clusters Handbook, Second Edition has been fully revised and updated to cover the latest tools and features. Find out how to prepare your hardware, deploy Oracle Real Application Clusters, optimize data integrity, and integrate seamless failover protection. Troubleshooting, performance tuning, and application development are also discussed in this comprehensive Oracle Press guide.
- Install and configure Oracle Real Application Clusters
- Configure and manage diskgroups using Oracle Automatic Storage Management
- Work with services, voting disks, and Oracle Clusterware Repository
- Look under the hood of the Cache Fusion and Global Resource Directory operations in Oracle Real Application Clusters
- Explore the internal workings of backup and recovery in Oracle Real Application Clusters
- Employ workload balancing and the Transparent Application Failover feature of an Oracle database
- Get complete coverage of Stretch Clusters, also known as Metro Clusters
- Troubleshoot Oracle Clusterware using the most advanced diagnostics available
- Develop custom Oracle Real Application Clusters applications