3 Cloud Security Mistakes You’re Likely Making Without Knowing

Those hastily moving to post-pandemic cloud-based platforms are likely to make some major security mistakes, depending on how fast they are moving. Why? This is new to most of them, there are few known best practices for cloud security, and humans get overwhelmed with the tasks of securely moving to the cloud quickly.

I’ve put together a short list of some of the security mistakes I see as enterprises rush to the cloud.

Mistake 1: Not gathering and reacting to operational security data in real time.

The notion of SIEM (security information and event management) means gathering operational security data in a central location to manage existing or forthcoming incidents in real time. We can leverage data as a weapon: supporting audits, correlating data, and using predictive analytics, all to gain better insights as to the state of security and to proactively combat attacks.

Mistake 2: Not dealing with data security at the database levels.

Data security is really considered storage security by most of those who manage security in the cloud. This is a huge mistake, considering that data has special security needs, including governance and compliance policies for the data and how they link to security. Most important is the ability to manage security down to the row and object levels, ensuring that data can be protected in fine-grained ways. This typically means dealing with native database security and metadata management systems, something that most cloud security pros don’t understand. Not understanding security at the data level will likely lead to an external or accidental data loss event at some point.

Mistake 3: Not having a vision for cloud security.

An old boss of mine said: “You need to spend at least 10 percent of the time dreaming about what’s possible.” Those charged with cloud security need to focus on what’s next, as well as what’s now.

By the time you’ve set a course and deployed a technology solution around your planning and vision, two years will have passed for most enterprises—an eternity at the pace of cloud computing security.

Chances are you’re making at least one of these mistakes. If you’re not, congratulations. In the real world of cloud security, we need to be reinventing things continuously. That’s the ultimate best practice.

This post was originally published on this site

Total
0
Shares
Previous Article

Intra-Cloud Latency Is Real. Here's What To Do About It

Next Article

A Pragmatic Approach To The Post-Pandemic Run On Cloud Computing

Related Posts
Read More

Why Data-Driven Businesses Need A Data Catalog

Relational databases, data lakes, and NoSQL data stores are powerful at inserting, updating, querying, searching, and processing data. But the ironic aspect of working with data management platforms is they usually don’t provide robust tools or user interfaces to share what’s inside them. They are more like data vaults. You know there’s valuable data inside, but you have no easy way to assess it from the outside. The business challenge is dealing with a multitude of data vaults: multiple enterprise databases, smaller data stores, data centers, clouds, applications, BI tools, APIs, spreadsheets, and open data sources. Sure, you can query a relational database’s metadata for a list of tables, stored procedures, indexes, and other database objects to get a directory. But that is a time-consuming approach that requires technical expertise and only produces a basic listing from a single data source. You can use tools that will reverse engineer data models or provide ways to navigate the metadata. But these tools are more often designed for technologists and mainly used for auditing, documenting, or analyzing databases. In other words, these approaches to query the contents of databases and the tools to extract their metadata are insufficient for today’s data-driven business needs for several reasons: The technologies require too much technical expertise and are unlikely to be used by less-technical end-users. The methods are too manual for enterprises with multiple big data databases, disparate database technologies, and operating hybrid clouds. The approaches are not particularly useful to data scientists or citizen data scientists who want to work collaboratively or run machine learning experiments with primary and derived data sets. The strategy of auditing database metadata doesn’t make it easy for data management teams to institute proactive data governance. A single source of truth of an organization’s data assets Data catalogs have been around for some time and have become more strategic today as organizations scale big data platforms, operate in hybrid clouds, invest in data science and machine learning programs, and sponsor data-driven organizational behaviors. The first concept to understand about data catalogs is that they are tools for the entire organization to learn and collaborate around data sources. They are important to organizations trying to be more data-driven, ones with data scientists experimenting with machine learning, and others embedding analytics in customer-facing applications. Database engineers, software developers, and other technologists take on responsibilities to integrate data catalogs with the primary enterprise data sources. They also use and contribute to the data catalog, especially when databases are created or updated. In that respect, data catalogs that interface with the majority of an enterprise’s data assets are a single source of truth. They help answer what data exists, how to find the best data sources, how to protect data, and who has expertise. The data catalog includes tools to discover data sources, capture metadata about those sources, search them, and provide some metadata management capabilities. Many data catalogs go beyond the notion of a structured directory. Data catalogs often include relationships between data sources, entities, and objects. Most catalogs track different classes of metadata, especially on confidentiality, privacy, and security. They capture and share information on how different people, departments, and applications utilize data sources. Most data catalogs also include tools to define data dictionaries; some bundle in tools to profile data, cleanse data, and perform other data stewardship functions. Specialized data catalogs also enable or interface with master data management and data lineage capabilities. Data catalog products and services The market is full of data catalog tools and platforms. Some products grew out of other infrastructure and enterprise data management capabilities. Others represent a new generation of capabilities and focus on ease of use, collaboration, and machine learning differentiators. Naturally, choice will depend on scale, user experience, data science strategy, data architecture, and other organization requirements.  Here is a sample of data catalog products: Azure Data Catalog and AWS Glue are data cataloging services built into public cloud platforms. Many data integration platforms have data cataloging capabilities, including Informatica Enterprise Data Catalog, Talend Data Catalog, SAP Data Hub, and IBM Infosphere Information Governance Catalog. Some data catalogs are designed for big data platforms and hybrid clouds, such as Cloudera Data Platform and InfoWorks DataFoundry, which supports data operations and orchestration. There are stand-alone platforms with machine learning capabilities, including Unifi Data Catalog, Alation Data Catalog, Collibra Catalog, Waterline Data, and IBM Watson Knowledge Catalog. Master data management tools such as Stibo Systems and Reltio and customer data platforms such as Arm Treasure Data can also function as data catalogs. Machine learning capabilities drive insights and experimentation Data catalogs that automate data discovery, enable searching the repository, and provide collaboration tools are the basics. More advanced data catalogs include capabilities in machine learning, natural language processing, and low-code implementations. Machine learning capabilities take on several forms depending on the platform. For example, Unifi has a built-in recommendation engine that reviews how people are using, joining, and labeling primary and derived data sets. It captures utilization metrics and uses machine learning to make recommendations when other end-users query for similar data sets and patterns. Unifi also uses machine learning algorithms to profile data, identify sensitive personally identifiable information, and tag data sources. Collibra is using machine learning to help data stewards classify data. Automatic Data Classification analyzes new data sets and matches to 40 out-of-the-box classifications, such as addresses, financial information, and product identifiers. Waterline Data has patented fingerprinting technology that automates the discovery, classification, and management of enterprise data. One of their focus areas is identifying and tagging sensitive data; they claim to reduce the time needed for tagging by 80 percent. Different platforms have different strategies and technical capabilities around data processing. Some only function at a data catalog and metadata level, whereas others have extended data prep, integration, cleansing, and other data operational capabilities. InfoWorks DataFoundry is an enterprise data operations and orchestration system that has direct integration with machine learning algorithms. It has a low-code, visual programming interface enabling end-users to connect data with machine learning algorithms such as k-means clustering and random forest classification. We’re in the early stages of proactive platforms such as data catalogs that provide governance, operational capabilities, and discovery tools for enterprises with growing data assets. As organizations realize more business value from data and analytics, there will be a greater need to scale and manage data practices. Machine learning capabilities will likely be one area where different data catalog platforms compete.