- Patent Number:
10,198,293
- Appl. No:
15/462797
- Application Filed:
March 17, 2017
- Abstract:
According to a general aspect, a method may include receiving a computing task, wherein the computing task includes a plurality of operations. The method may include allocating the computing task to a data node, wherein the data node includes at least one host processor and an intelligent storage medium, wherein the intelligent storage medium comprises at least one controller processor and a non-volatile memory, and wherein each data node includes at least three processors among the at least one host processor and the at least one controller processor. The method may include dividing the computing task into at least a first chain of operations and a second chain of operations. The method may include assigning the first chain of operations to the intelligent storage medium of the data node. The method may further include assigning the second chain of operations to a central processor of the data node.
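The abstract's divide-and-assign flow can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Chain` structure, the byte-size estimates, and all names are assumptions. The decision rule (run a chain in the intelligent storage medium when its input data exceeds its output data) follows the comparison described in the claims.

```python
# Illustrative sketch: a computing task is split into chains of operations,
# and each chain is assigned either to the data node's central (host)
# processor or to the intelligent storage medium (ISM), depending on
# whether the chain reduces its data (input larger than output).

from dataclasses import dataclass, field


@dataclass
class Chain:
    name: str
    input_bytes: int   # estimated amount of input data the chain reads
    output_bytes: int  # estimated amount of output data the chain produces


@dataclass
class Assignment:
    ism: list = field(default_factory=list)   # chains for the ISM controller processor(s)
    host: list = field(default_factory=list)  # chains for the host/central processor


def assign_chains(chains):
    """Assign data-reducing chains to in-storage processing, the rest to the host."""
    plan = Assignment()
    for chain in chains:
        if chain.input_bytes > chain.output_bytes:
            plan.ism.append(chain.name)    # reduces data: run near storage
        else:
            plan.host.append(chain.name)   # does not reduce data: run on host
    return plan


plan = assign_chains([
    Chain("filter", input_bytes=10_000, output_bytes=100),  # large reduction
    Chain("render", input_bytes=100, output_bytes=5_000),   # expands data
])
print(plan.ism, plan.host)  # -> ['filter'] ['render']
```

Running a data-reducing chain inside the storage medium avoids moving the large input across the storage interface; only the small output travels to the host.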
- Inventors:
Samsung Electronics Co., Ltd. (Suwon-si, Gyeonggi-do, KR)
- Assignees:
SAMSUNG ELECTRONICS CO., LTD. (KR)
- Claim:
1. A scheduler computing device comprising: a computing task memory configured to store at least one computing task, wherein the computing task is to be executed by a data node of a distributed computing system, wherein the distributed computing system comprises at least one data node, wherein each data node includes at least one host processor and an intelligent storage medium, wherein the intelligent storage medium comprises a form of storage medium that includes at least one controller processor and a non-volatile memory; wherein each data node includes at least three processors, wherein the at least three processors includes the at least one host processor and the at least one controller processor; and a scheduler processor configured to: decide whether to assign the computing task to be executed by either one of the host processors of the data node or one of the controller processors of the intelligent storage medium based, at least in part, upon an amount of output data associated with the computing task compared to an amount of input data associated with the computing task, wherein if the amount of input data is larger than the amount of output data, deciding to assign the computing task to the one of the controller processors of the intelligent storage medium, and assign, according to the decision, the computing task to be executed by either one of the host processors of the data node or one of the controller processors of the intelligent storage medium.
- Claim:
2. The scheduler computing device of claim 1, wherein the distributed computing system comprises a heterogeneous plurality of data nodes, wherein the plurality of data nodes includes data nodes having respective controller processors with differing capabilities.
- Claim:
3. The scheduler computing device of claim 2, wherein the plurality of data nodes includes: a data node that comprises an intelligent storage medium having a graphical processor.
- Claim:
4. The scheduler computing device of claim 1, wherein the scheduler processor is configured to: divide a larger computing task into one or more smaller computing tasks, wherein each of the computing tasks includes a chain of one or more operations, and wherein each smaller computing task is performed by either one of the host processors of the data node or the intelligent storage medium of the data node; classify each smaller computing task into one of at least two categories, wherein each category is capable of being performed by at least one processor; and based upon the category associated with a smaller computing task, assign each respective smaller computing task to either a host processor of the data node or a processor of the intelligent storage medium of the data node.
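Claim 4's divide-classify-assign flow might look like the sketch below. The two categories and every helper name are illustrative assumptions; the classification rule is assumed here to reuse the input/output comparison from claim 1.

```python
# Sketch of claim 4: divide a larger task into smaller tasks (each a chain
# of operations), classify each into one of two categories, and assign each
# task to a host processor or an ISM controller processor by category.

from enum import Enum


class Category(Enum):
    DATA_REDUCING = "ism"   # suited to a controller processor in the ISM
    OTHER = "host"          # suited to a host processor


def classify(task):
    """Assumed rule: data-reducing tasks (input > output) go to the ISM."""
    if task["input_bytes"] > task["output_bytes"]:
        return Category.DATA_REDUCING
    return Category.OTHER


def schedule(smaller_tasks):
    """Map each smaller computing task to the processor class that runs it."""
    return {task["name"]: classify(task).value for task in smaller_tasks}


result = schedule([
    {"name": "scan", "input_bytes": 1_000_000, "output_bytes": 10_000},
    {"name": "join", "input_bytes": 10_000, "output_bytes": 50_000},
])
print(result)  # -> {'scan': 'ism', 'join': 'host'}
```

Note that claim 5 allows the scheduler to override this per-task rule when doing so makes the larger computing task more efficient overall.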
- Claim:
5. The scheduler computing device of claim 4, wherein the scheduler processor is configured to: assign one of the smaller computing tasks to be executed by one of the processors of the intelligent storage medium even if the smaller computing task would be more efficiently executed by one of the host processors, if the larger computing task is made more efficient.
- Claim:
6. The scheduler computing device of claim 1, wherein the scheduler processor is configured to: assign the computing task to be executed by either one of the host processors of the data node or one of the processors of the intelligent storage medium, based, at least in part, upon load balancing of computing tasks across a plurality of data nodes.
- Claim:
7. The scheduler computing device of claim 1, wherein the scheduler processor is configured to: assign the computing task to be executed by either one of the host processors of the data node or one of the processors of the intelligent storage medium, based, primarily upon an amount of input data compared to an amount of output data, and secondarily upon a characteristic associated with at least one data node.
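Claims 7 and 8 describe a two-level decision: the input/output comparison is primary, and a node characteristic set by a preconfigured policy is secondary. A hedged sketch, where the `prefer_ism` policy flag and the tie-break condition are purely hypothetical:

```python
# Sketch of claim 7's two-level placement decision. Primary criterion:
# compare input vs. output data. Secondary criterion (here applied only
# when the sizes are equal): a preconfigured policy setting describing a
# characteristic of the data node (claim 8).


def place(input_bytes, output_bytes, node_policy):
    """Return 'ism' or 'host' for a computing task on one data node."""
    if input_bytes > output_bytes:   # primary: data reduction favors in-storage
        return "ism"
    if input_bytes < output_bytes:   # primary: data expansion favors the host
        return "host"
    # secondary: fall back to the node's preconfigured policy setting
    return "ism" if node_policy.get("prefer_ism") else "host"


print(place(100, 10, {}))                    # -> ism
print(place(10, 10, {"prefer_ism": True}))   # -> ism (policy tie-break)
print(place(10, 100, {}))                    # -> host
```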
- Claim:
8. The scheduler computing device of claim 7, wherein a characteristic associated with the at least one data node is determined by a preconfigured policy setting.
- Claim:
9. A method comprising: receiving a computing task, wherein the computing task includes a plurality of operations; allocating the computing task to a data node, wherein the data node includes at least one host processor and an intelligent storage medium, wherein the intelligent storage medium comprises a form of storage medium that includes at least one controller processor and a non-volatile memory, wherein each data node includes at least three processors, wherein the at least three processors includes the at least one host processor and the at least one controller processor; dividing the computing task into at least a first chain of operations and a second chain of operations, wherein dividing the computing task includes determining for each chain of operations an amount of output data associated with a respective chain of operations and an amount of input data associated with the respective chain of operations, and wherein if the amount of input data is larger than the amount of output data, deciding to assign the computing task to the one of the controller processors of the intelligent storage medium; assigning the first chain of operations to the intelligent storage medium of the data node; and assigning the second chain of operations to a central processor of the data node.
- Claim:
10. The method of claim 9, wherein the distributed computing system comprises a heterogeneous plurality of data nodes, wherein the plurality of data nodes includes data nodes having respective controller processors with differing capabilities.
- Claim:
11. The method of claim 9, wherein dividing includes: further dividing each chain of operations into a plurality of sub-chains; and assigning each sub-chain to a different processor.
- Claim:
12. The method of claim 9, wherein dividing includes: dividing based, at least in part, upon load balancing of computing tasks across the plurality of data nodes.
- Claim:
13. The method of claim 9, wherein dividing includes: dividing the computing task based, primarily upon an amount of input data compared to an amount of output data, and secondarily upon a characteristic associated with the plurality of data nodes.
- Claim:
14. The method of claim 13, wherein a characteristic associated with the plurality of data nodes is determined by a preconfigured policy setting.
- Claim:
15. A data node comprising: a central processor configured to execute at least one of a first set of operations upon data stored by an intelligent storage medium; the intelligent storage medium comprising: a memory configured to store data, a first controller processor configured to execute at least one of a second set of operations upon data stored by the intelligent storage medium, and a second controller processor configured to execute at least one of a third set of operations upon data stored by the intelligent storage medium; and a network interface configured to receive a plurality of operations from a scheduling computing device; and wherein the data node is configured to: divide the computing task into at least the first set of operations and a second set of operations, wherein dividing the computing task includes determining for each chain of operations an amount of output data associated with a respective chain of operations and an amount of input data associated with the respective chain of operations, and wherein if the amount of input data is larger than the amount of output data, deciding to assign the computing task to the one of the controller processors of the intelligent storage medium, assign the first set of operations to the central processor for execution, and assign the second set of operations to the intelligent storage medium for execution.
- Claim:
16. The data node of claim 15, wherein the first controller processor includes a general purpose processor, and wherein the second controller processor includes a graphical processor.
- Claim:
17. The data node of claim 15, wherein the first controller processor includes a general purpose processor, and wherein the second controller processor includes a re-programmable processor.
- Claim:
18. The data node of claim 15, wherein the data node is configured to: assign the computing task to be executed by either one of the processors of the data node or one of the processors of the intelligent storage medium, based, at least in part, upon load balancing of computing tasks across a plurality of data nodes.
- Claim:
19. The data node of claim 15, wherein the data node is configured to: assign the computing task to be executed by either one of the processors of the data node or one of the processors of the intelligent storage medium, based, primarily upon an amount of input data compared to an amount of output data, and secondarily upon a characteristic associated with at least one data node.
- Claim:
20. The data node of claim 19, wherein a characteristic associated with the at least one data node is determined by a preconfigured policy setting.
- Patent References Cited:
7,657,706 February 2010 Iyer
8,566,831 October 2013 Jellinek et al.
8,819,335 August 2014 Salessi et al.
9,477,511 October 2016 Jacobson
2013/0191555 July 2013 Liu
- Other References:
Ji, Changqing et al., “Big Data Processing: Big Challenges and Opportunities,” Journal of Interconnection Networks, vol. 13, Nos. 3 & 4, Dec. 13-15, 2012, 21 pages. cited by applicant
Kang, Yangwook, et al., “Enabling Cost-effective Data Processing with Smart SSD,” 2013, 12 pages. cited by applicant
Tran, Nam-Luc et al., “AROM: Processing Big Data With Data Flow Graphs and Functional Programming,” 2012, 8 pages. cited by applicant
Zhang, Yanfeng, et al., “Maiter: An Asynchronous Graph Processing Framework for Delta-based Accumulative Iterative Computation,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, Issue 8, Aug. 2014, published Sep. 16, 2013, 17 pages. cited by applicant
- Primary Examiner:
Zhao, Bing
- Attorney, Agent or Firm:
Renaissance IP Law Group LLP
- Identifier:
edspgr.10198293