Data pipeline — example sentences

The system uses a data processing pipeline to identify and extract chemical entities from patent text and images.

Logstash is a dynamic data processing pipeline for ingesting data into Elasticsearch or other storage systems from many different sources simultaneously.

Logstash is a server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch.


Kafka is a messaging system that was originally developed at LinkedIn to serve as the foundation for LinkedIn's activity stream and operational data processing pipeline.

In fact, many production NLP models are deeply embedded in the Transform step of the "Extract-Transform-Load" (ETL) pipeline of data processing.
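The ETL pattern mentioned above can be sketched in a few lines of Python. This is a minimal, hypothetical example (all function and variable names here are illustrative, not from any real system); the Transform step stands in for where a production NLP model would sit.

```python
# Minimal Extract-Transform-Load (ETL) sketch: the Transform step is
# where a production system might embed an NLP model; here it just
# normalizes whitespace and case.

def extract():
    # Pretend these rows came from a database or message queue.
    return ["  Hello World ", "DATA Pipelines  "]

def transform(rows):
    return [" ".join(r.split()).lower() for r in rows]

def load(rows, sink):
    # Append the transformed rows to an in-memory "stash".
    sink.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # ['hello world', 'data pipelines']
```

The point of the shape is that each stage has a single responsibility, so a model can be swapped into Transform without touching Extract or Load.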

AWS Data Pipeline helps you easily create complex data processing workloads that are fault tolerant, repeatable, and highly available.

To parallelize data processing across a distributed database cluster, MongoDB provides two programming models: the aggregation pipeline and MapReduce.
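An aggregation pipeline is simply an ordered list of stage documents, each feeding its output to the next. The sketch below builds such a pipeline and mimics its effect in plain Python on sample documents (the field names and data are made up for illustration; against a real cluster you would pass the same list to pymongo's `Collection.aggregate`):

```python
# An aggregation pipeline is an ordered list of stages; this chaining
# is what lets MongoDB stream and parallelize the work across a cluster.
pipeline = [
    {"$match": {"status": "ok"}},           # stage 1: filter documents
    {"$group": {"_id": "$city",             # stage 2: group by city
                "total": {"$sum": "$amount"}}},
]

# Mimic the two stages in plain Python on sample documents.
docs = [
    {"city": "Oslo", "status": "ok", "amount": 3},
    {"city": "Oslo", "status": "bad", "amount": 9},
    {"city": "Bergen", "status": "ok", "amount": 5},
]
matched = [d for d in docs if d["status"] == "ok"]   # $match
totals = {}
for d in matched:                                    # $group / $sum
    totals[d["city"]] = totals.get(d["city"], 0) + d["amount"]
print(totals)  # {'Oslo': 3, 'Bergen': 5}
```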

Apache Storm runs continuously, consuming data from the configured sources (spouts) and passing it down the processing pipeline (bolts).

Apache Falcon is a data processing and management solution for Hadoop, designed for data motion, coordination of data pipelines, lifecycle management, and data discovery.

Streams, as in reactive streams: unbounded flows of data processing, enabling asynchronous, non-blocking, back-pressured transformation pipelines between a multitude of sources and destinations.

These efforts involved extensive quality control and coordinated data processing, as well as massive, systematic experimental validation of the computational pipelines used to detect mutations.

The computer here needs to run a medical imaging pipeline, which requires taking in sensor data, processing it, and visualizing it.

Apache Falcon is a data management framework that simplifies data lifecycle management and processing pipelines on Apache Hadoop.

Because of uncertainties with the data pipeline and processing, the ESA is planning to split the third data release (DR3) into two packages.

The concept of Pipeline Runners in Beam translates data processing pipelines into an API that is compatible with multiple distributed processing backends.
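The runner idea can be sketched in plain Python. This is a conceptual toy, not the real Beam API: a pipeline records a chain of transforms, and a "runner" decides how to execute that chain (Beam hands the same graph to Spark, Flink, Dataflow, and so on; here the runner is just a local loop).

```python
# Conceptual sketch of pipeline-vs-runner separation (not the Beam API).
class Pipeline:
    def __init__(self, data):
        self.data = data
        self.steps = []          # the recorded transform graph

    def apply(self, fn):
        self.steps.append(fn)    # defining the pipeline runs nothing yet
        return self

    def run(self):
        # A trivial local "runner"; a distributed runner would execute
        # the same recorded steps on a cluster backend instead.
        out = self.data
        for fn in self.steps:
            out = [fn(x) for x in out]
        return out

result = Pipeline([1, 2, 3]).apply(lambda x: x * 10).apply(lambda x: x + 1).run()
print(result)  # [11, 21, 31]
```

The key design point, which Beam shares, is that defining the pipeline is separate from executing it, so the same definition can target multiple backends.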

These skills are relevant to many types of programming, but many of the scenarios used will involve data analysis, conversion, validation, and processing pipelines.

You use the Beam SDK of your choice to build a program that defines your data processing pipeline.

In the real world, it is sometimes impossible to acquire some data, or the data is lost during processing.

The key to FPGA real-time pipeline processing is that it can buffer several lines of image data in its internal block RAM.

The main components of the system are a data curation pipeline, an algorithmic computation system, a linguistic processing system, and an automated presentation system.

This centre will perform the automatic pipeline processing of all the XMM data from all experiments.

It will provide real-time sensor data for managing bacterial outbreaks, from food processing equipment to oil pipelines or pharmaceutical manufacturing.

Let's see an example of a data processing pipeline.

Thus the shell itself does no direct processing of the data flowing through the pipeline.
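To make the point concrete: in a command like `printf 'a\nb\nc\n' | wc -l`, the shell only connects the two processes with a pipe; the data flows directly between them. The Python sketch below does the same wiring by hand (it assumes a Unix-like system where `printf` and `wc` exist as executables):

```python
import subprocess

# Wire printf's stdout into wc's stdin ourselves, exactly as the shell
# would for `printf 'a\nb\nc\n' | wc -l` — neither the shell nor this
# script ever touches the bytes flowing through the pipe.
p1 = subprocess.Popen(["printf", r"a\nb\nc\n"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["wc", "-l"], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()            # let wc see EOF once printf exits
out, _ = p2.communicate()
print(out.decode().strip())  # 3
```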

Coinbase created a streaming data insight pipeline on AWS, with real-time exchange analytics processed by Amazon Kinesis, a managed big-data processing service.

What does a data pipeline do?

For machine learning, the main task of a data pipeline is to have the machine analyze existing data so that it can make sound judgments about new data.

What is an ETL data pipeline?

ETL, the best-known data pipeline pattern, refers to the process of extracting data, transforming it, and finally loading it into a target data store. It improves data management efficiency and helps data developers iterate faster to keep meeting the data needs of a growing business.

What does "pipeline" mean?

A pipeline is a processing line, equivalent in meaning to an assembly line; more formally, you can think of it as an end-to-end solution. Example 1: the classic case is the GPU rendering pipeline, which describes how many stages a frame must pass through to be rendered.