/r/DistributedComputing

For those who want to cure cancer and make first contact in their spare time

To discuss distributed computing projects such as Folding@Home or the many BOINC programs.

The centre for reddit teams:

/r/DistributedComputing

3,658 Subscribers

10

"Parallel-Committees": A Novelle Secure and High-Performance Distributed Database Architecture

In my PhD thesis, I proposed a novel fault-tolerant, self-configurable, scalable, secure, decentralized, and high-performance distributed database replication architecture, named “Parallel Committees”.

I utilized an innovative sharding technique to enable the use of Byzantine Fault Tolerance (BFT) consensus mechanisms in very large-scale networks.

With this full sharding approach, which supports both processing sharding and storage sharding, the system's computing power and storage capacity grow without bound as more processors and replicas join the network, while a classic BFT consensus is still employed.

My approach also allows an unlimited number of clients to join the system simultaneously without reducing system performance and transactional throughput.

I introduced several innovative techniques: for distributing nodes between shards, processing transactions across shards, improving security and scalability of the system, proactively circulating committee members, and forming new committees automatically.

I introduced an innovative and novel approach to distributing nodes between shards, using a public key generation process, called “KeyChallenge”, that simultaneously mitigates Sybil attacks and serves as a proof-of-work. The “KeyChallenge” idea is published in the peer-reviewed conference proceedings of ACM ICCTA 2024, Vienna, Austria.
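
To give a flavor of the idea, here is a minimal sketch (the range format and the stand-in key generator are illustrative, not the exact construction in the dissertation): a joining node must brute-force key pairs until every character of its public key falls within the system-announced ranges, which makes key generation a proof-of-work and raises the cost of minting Sybil identities.

    import os
    import hashlib

    # Illustrative range table: the allowed characters for each of the
    # first 8 hex positions. In the scheme, the system sets these ranges
    # and can change them periodically (see "Key-Withholding" below).
    ALLOWED = {i: set("0123456789ab") for i in range(8)}

    def gen_keypair():
        # Stand-in for real asymmetric keygen: the "public key" here is
        # the SHA-256 hex digest of a random 32-byte seed.
        seed = os.urandom(32)
        return seed, hashlib.sha256(seed).hexdigest()

    def satisfies_challenge(pub: str) -> bool:
        return all(pub[i] in chars for i, chars in ALLOWED.items())

    def key_challenge():
        # Brute-force until the key matches every range. Expected tries
        # grow exponentially with the number of constrained positions,
        # which is what makes this a proof-of-work.
        tries = 0
        while True:
            tries += 1
            seed, pub = gen_keypair()
            if satisfies_challenge(pub):
                return seed, pub, tries

    seed, pub, tries = key_challenge()
    print(f"found {pub} after {tries} tries")

With 12 of 16 hex characters allowed in 8 positions, a node needs about (16/12)^8 ≈ 10 attempts; narrowing the ranges tunes the work factor.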

In this regard, I proved that it is not straightforward for an attacker to generate a public key whose characters all match the ranges set by the system.

I also explained how to form new committees automatically based on the arrival rate of candidate processor nodes.

The purpose of this technique is to use the full capacity of the network optimally, so that surplus processors waiting idle in a committee's queue are employed in a new committee and play an effective role in increasing the throughput and efficiency of the system.

This technique maximizes the utilization of the processor nodes and of the network's computation and storage capacity, extending both processing sharding and storage sharding as far as possible.

In the proposed architecture, members of each committee are proactively and alternately replaced with backup processors. This technique of proactively circulating committee members has three main results:

  • (a) preventing a committee from being occupied by a group of processor nodes for a long time period, in particular, Byzantine and faulty processors,
  • (b) preventing committees from growing too much, which could lead to scalability issues and latency in processing the clients’ requests,
  • (c) due to the proactive circulation of committee members, over a given time-frame, there exists a probability that several faulty nodes are excluded from the committee and placed in the committee queue. Consequently, during this time-frame, the faulty nodes in the committee queue do not impact the consensus process.
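
A minimal sketch of this circulation, assuming a fixed committee size, a FIFO queue of backups, and round-robin eviction (illustrative; the dissertation specifies the actual selection and timing rules):

    from collections import deque

    def circulate(committee: list, queue: deque, k: int) -> None:
        # Swap the k oldest committee members for the first k queued
        # backups; evicted members rejoin the back of the queue, so no
        # fixed group (Byzantine or otherwise) can hold a committee
        # indefinitely, and the committee never grows.
        for _ in range(min(k, len(queue))):
            evicted = committee.pop(0)
            committee.append(queue.popleft())
            queue.append(evicted)

    committee, queue = ["p1", "p2", "p3", "p4"], deque(["p5", "p6", "p7"])
    circulate(committee, queue, k=2)
    print(committee, list(queue))  # ['p3', 'p4', 'p5', 'p6'] ['p7', 'p1', 'p2']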

This procedure can improve the fault-tolerance threshold of the consensus mechanism.

I also elucidated strategies to thwart the malicious behavior of "Key-Withholding", in which an attacker withholds previously generated public keys for later use; the countermeasure, periodically altering the acceptable ranges for each character of the public key, prevents such keys from granting future shard access.

The proposed architecture also effectively reduces the number of undesirable cross-shard transactions, which are more complex and costly to process than intra-shard transactions.

I compared the proposed idea with other sharding-based data replication systems and mentioned the main differences, which are detailed in Section 4.7 of my dissertation.

The proposed architecture not only opens the door to a new world for further research in this field but also represents a significant step forward in enhancing distributed databases and data replication systems.

The proposed idea has been published in the peer-reviewed conference proceedings of IEEE BCCA 2023.

Additionally, I provided an explanation for the decision not to employ a blockchain structure in the proposed architecture, an issue that is discussed in great detail in Chapter 5 of my dissertation.

The complete version of my dissertation is accessible via the following link: https://www.researchgate.net/publication/379148513_Novel_Fault-Tolerant_Self-Configurable_Scalable_Secure_Decentralized_and_High-Performance_Distributed_Database_Replication_Architecture_Using_Innovative_Sharding_to_Enable_the_Use_of_BFT_Consensus_Mec

I compared my proposed database architecture with various distributed databases and data replication systems in Section 4.7 of my dissertation. This comparison included Apache Cassandra, Amazon DynamoDB, Google Bigtable, Google Spanner, and ScyllaDB. I strongly recommend reviewing that section for better clarity and understanding.

The main problem is as follows:

Classic consensus mechanisms such as Paxos or PBFT provide strong and strict consistency in distributed databases. However, due to their low scalability, they are not commonly used. Instead, methods such as eventual consistency are employed, which, while not providing strong consistency, offer much higher performance compared to classic consensus mechanisms. The primary reason for the low scalability of classic consensus mechanisms is their high time complexity and message complexity.
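
To make the message complexity concrete: in PBFT's normal case, the prepare and commit phases are roughly all-to-all, so messages per decision grow quadratically with the number of replicas. A back-of-the-envelope sketch (approximate; ignoring client replies and view changes):

    def pbft_messages(n: int) -> int:
        # Normal case, roughly: pre-prepare (leader to n-1 replicas),
        # then prepare and commit, each about n*(n-1) messages all-to-all.
        return (n - 1) + 2 * n * (n - 1)

    print(pbft_messages(1000))  # 1998999: ~2 million messages per decision
    print(pbft_messages(10))    # 189: cheap inside a 10-node committee

This is the arithmetic behind committee-based sharding: keeping each consensus instance inside a small committee bounds the quadratic term while the network as a whole keeps growing.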

I recommend watching the following video explaining this matter:
https://www.college-de-france.fr/fr/agenda/colloque/taking-stock-of-distributed-computing/living-without-consensus

My proposed architecture enables the use of classic consensus mechanisms such as Paxos, PBFT, etc., in very large and high-scale networks, while providing very high transactional throughput. This ensures both strict consistency and high performance in a highly scalable network. This is achievable through an innovative approach of parallelization and sharding in my proposed architecture.

If needed, I can provide more detailed explanations of the problem and the proposed solution.

I would greatly appreciate feedback and comments on the distributed database architecture proposed in my PhD dissertation. Your insights and opinions are invaluable, so please feel free to share them without hesitation.

0 Comments
2024/05/10
12:57 UTC

2

TinyKV with TinySQL cluster deployment error

I get this error when I try to deploy a tinykv cluster as shown in the talent-plan/tinykv repo ("A course to build distributed key-value service based on TiKV model", github.com):

mkdir -p data                                           # create the data directory

./tinyscheduler-server                                  # start the scheduler

./tinykv-server -path=data                              # start the KV store

./tinysql-server --store=tikv --path="127.0.0.1:2379"   # start the SQL layer

mysql -u root -h 127.0.0.1 -P 4000                      # connect with a MySQL client

You can find the implementations here:

sakura-ysy/TinyKV-2022-doc: TinyKV-2022, personal code and documentation; final project score: 98.46. (github.com)

RinChanNOWWW/tinysql-impl: Implementation of https://github.com/tidb-incubator/tinysql

[2024/04/05 20:17:19.026 +00:00] [WARN] [session.go:539] ["run statement failed"] [schemaVersion=0] [error="[schema:1049]Unknown database 'mysql'"] [errorVerbose="[schema:1049]Unknown database 'mysql'\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/tidb/parser/terror.(*Error.GenWithStackByArgs\n\t/go/tinysql/parser/terror/terror.go:243\ngithub.com/pingcap/tidb/executor.(*SimpleExec.executeUse\n\t/go/tinysql/executor/simple.go:66\ngithub.com/pingcap/tidb/executor.(*SimpleExec.Next\n\t/go/tinysql/executor/simple.go:49\ngithub.com/pingcap/tidb/executor.Next\n\t/go/tinysql/executor/executor.go:161\ngithub.com/pingcap/tidb/executor.(*ExecStmt.handleNoDelayExecutor\n\t/go/tinysql/executor/adapter.go:227\ngithub.com/pingcap/tidb/executor.(*ExecStmt.handleNoDelay\n\t/go/tinysql/executor/adapter.go:214\ngithub.com/pingcap/tidb/executor.(*ExecStmt.Exec\n\t/go/tinysql/executor/adapter.go:190\ngithub.com/pingcap/tidb/session.runStmt\n\t/go/tinysql/session/tidb.go:219\ngithub.com/pingcap/tidb/session.(*session.executeStatement\n\t/go/tinysql/session/session.go:536\ngithub.com/pingcap/tidb/session.(*session.execute\n\t/go/tinysql/session/session.go:615\ngithub.com/pingcap/tidb/session.(*session.Execute\n\t/go/tinysql/session/session.go:563\ngithub.com/pingcap/tidb/session.checkBootstrapped\n\t/go/tinysql/session/bootstrap.go:162\ngithub.com/pingcap/tidb/session.bootstrap\n\t/go/tinysql/session/bootstrap.go:130\ngithub.com/pingcap/tidb/session.runInBootstrapSession\n\t/go/tinysql/session/session.go:792\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/go/tinysql/session/session.go:753\nmain.createStoreAndDomain\n\t/go/tinysql/tidb-server/main.go:133\nmain.main\n\t/go/tinysql/tidb-server/main.go:105\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"] [session="{\n \"currDBName\": \"\",\n \"id\": 0,\n \"status\": 2,\n \"strictMode\": true,\n \"user\": \"\"\n}"]

[2024/04/05 20:17:19.026 +00:00] [WARN] [session.go:606] ["compile SQL failed"] [error="[schema:1146]Table 'mysql.tidb' doesn't exist"] [errorVerbose="[schema:1146]Table 'mysql.tidb' doesn't exist\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/tidb/parser/terror.(*Error.GenWithStackByArgs\n\t/go/tinysql/parser/terror/terror.go:243\ngithub.com/pingcap/tidb/infoschema.(*infoSchema.TableByName\n\t/go/tinysql/infoschema/infoschema.go:169\ngithub.com/pingcap/tidb/planner/core.(*preprocessor.handleTableName\n\t/go/tinysql/planner/core/preprocess.go:517\ngithub.com/pingcap/tidb/planner/core.(*preprocessor.Leave\n\t/go/tinysql/planner/core/preprocess.go:118\ngithub.com/pingcap/tidb/parser/ast.(*TableName.Accept\n\t/go/tinysql/parser/ast/dml.go:147\ngithub.com/pingcap/tidb/parser/ast.(*TableSource.Accept\n\t/go/tinysql/parser/ast/dml.go:191\ngithub.com/pingcap/tidb/parser/ast.(*Join.Accept\n\t/go/tinysql/parser/ast/dml.go:76\ngithub.com/pingcap/tidb/parser/ast.(*TableRefsClause.Accept\n\t/go/tinysql/parser/ast/dml.go:292\ngithub.com/pingcap/tidb/parser/ast.(*SelectStmt.Accept\n\t/go/tinysql/parser/ast/dml.go:449\ngithub.com/pingcap/tidb/planner/core.Preprocess\n\t/go/tinysql/planner/core/preprocess.go:42\ngithub.com/pingcap/tidb/executor.(*Compiler.Compile\n\t/go/tinysql/executor/compiler.go:34\ngithub.com/pingcap/tidb/session.(*session.execute\n\t/go/tinysql/session/session.go:603\ngithub.com/pingcap/tidb/session.(*session.Execute\n\t/go/tinysql/session/session.go:563\ngithub.com/pingcap/tidb/session.getTiDBVar\n\t/go/tinysql/session/bootstrap.go:191\ngithub.com/pingcap/tidb/session.checkBootstrapped\n\t/go/tinysql/session/bootstrap.go:168\ngithub.com/pingcap/tidb/session.bootstrap\n\t/go/tinysql/session/bootstrap.go:130\ngithub.com/pingcap/tidb/session.runInBootstrapSession\n\t/go/tinysql/session/session.go:792\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/go/tinysql/session/session.go:753\nmain.createStoreAndDomain\n\t/go/tinysql/tidb-server/main.go:133\nmain.main\n\t/go/tinysql/tidb-server/main.go:105\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"] [SQL="SELECT HIGH_PRIORITY VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME=\"bootstrapped\""]

[2024/04/05 20:17:19.037 +00:00] [INFO] [region_cache.go:976] ["mark store's regions need be refill"] [store=127.0.0.1:20160]

[2024/04/05 20:17:19.037 +00:00] [INFO] [region_cache.go:402] ["switch region peer to next due to send request fail"] [current="region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 127.0.0.1:20160, idx: 0"] [needReload=true] [error="rpc error: code = Unknown desc = responses count 1 is not equal to requests count 2"] [errorVerbose="rpc error: code = Unknown desc = responses count 1 is not equal to requests count 2\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15\ngithub.com/pingcap/tidb/store/tikv/tikvrpc.CallRPC\n\t/go/tinysql/store/tikv/tikvrpc/tikvrpc.go:319\ngithub.com/pingcap/tidb/store/tikv.(*rpcClient.SendRequest\n\t/go/tinysql/store/tikv/client.go:225\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender.sendReqToRegion\n\t/go/tinysql/store/tikv/region_request.go:142\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender.SendReqCtx\n\t/go/tinysql/store/tikv/region_request.go:112\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender.SendReq\n\t/go/tinysql/store/tikv/region_request.go:70\ngithub.com/pingcap/tidb/store/tikv.(*tikvStore.SendReq\n\t/go/tinysql/store/tikv/kv.go:312\ngithub.com/pingcap/tidb/store/tikv.actionPrewrite.handleSingleBatch\n\t/go/tinysql/store/tikv/2pc.go:367\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.doActionOnBatches\n\t/go/tinysql/store/tikv/2pc.go:313\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.doActionOnKeys\n\t/go/tinysql/store/tikv/2pc.go:301\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.prewriteKeys\n\t/go/tinysql/store/tikv/2pc.go:533\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.execute\n\t/go/tinysql/store/tikv/2pc.go:572\ngithub.com/pingcap/tidb/store/tikv.(*tikvTxn.Commit\n\t/go/tinysql/store/tikv/txn.go:188\ngithub.com/pingcap/tidb/kv.RunInNewTxn\n\t/go/tinysql/kv/txn.go:61\ngithub.com/pingcap/tidb/ddl.(*ddl.genGlobalIDs\n\t/go/tinysql/ddl/ddl.go:370\ngithub.com/pingcap/tidb/ddl.(*ddl.CreateSchema\n\t/go/tinysql/ddl/ddl_api.go:53\ngithub.com/pingcap/tidb/executor.(*DDLExec.executeCreateDatabase\n\t/go/tinysql/executor/ddl.go:124\ngithub.com/pingcap/tidb/executor.(*DDLExec.Next\n\t/go/tinysql/executor/ddl.go:79\ngithub.com/pingcap/tidb/executor.Next\n\t/go/tinysql/executor/executor.go:161\ngithub.com/pingcap/tidb/executor.(*ExecStmt.handleNoDelayExecutor\n\t/go/tinysql/executor/adapter.go:227\ngithub.com/pingcap/tidb/executor.(*ExecStmt.handleNoDelay\n\t/go/tinysql/executor/adapter.go:214\ngithub.com/pingcap/tidb/executor.(*ExecStmt.Exec\n\t/go/tinysql/executor/adapter.go:190\ngithub.com/pingcap/tidb/session.runStmt\n\t/go/tinysql/session/tidb.go:219\ngithub.com/pingcap/tidb/session.(*session.executeStatement\n\t/go/tinysql/session/session.go:536\ngithub.com/pingcap/tidb/session.(*session.execute\n\t/go/tinysql/session/session.go:615\ngithub.com/pingcap/tidb/session.(*session.Execute\n\t/go/tinysql/session/session.go:563\ngithub.com/pingcap/tidb/session.mustExecute\n\t/go/tinysql/session/bootstrap.go:280\ngithub.com/pingcap/tidb/session.doDDLWorks\n\t/go/tinysql/session/bootstrap.go:215\ngithub.com/pingcap/tidb/session.bootstrap\n\t/go/tinysql/session/bootstrap.go:138\ngithub.com/pingcap/tidb/session.runInBootstrapSession\n\t/go/tinysql/session/session.go:792\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/go/tinysql
/session/session.go:753"]

[2024/04/05 20:17:19.115 +00:00] [INFO] [region_cache.go:308] ["invalidate current region, because others failed on same store"] [region=2] [store=127.0.0.1:20160]

[2024/04/05 20:17:39.127 +00:00] [INFO] [region_cache.go:976] ["mark store's regions need be refill"] [store=127.0.0.1:20160]

[2024/04/05 20:17:39.127 +00:00] [INFO] [region_cache.go:402] ["switch region peer to next due to send request fail"] [current="region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 127.0.0.1:20160, idx: 0"] [needReload=true] [error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="rpc error: code = DeadlineExceeded desc = context deadline exceeded\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15\ngithub.com/pingcap/tidb/store/tikv/tikvrpc.CallRPC\n\t/go/tinysql/store/tikv/tikvrpc/tikvrpc.go:319\ngithub.com/pingcap/tidb/store/tikv.(*rpcClient.SendRequest\n\t/go/tinysql/store/tikv/client.go:225\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender.sendReqToRegion\n\t/go/tinysql/store/tikv/region_request.go:142\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender.SendReqCtx\n\t/go/tinysql/store/tikv/region_request.go:112\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender.SendReq\n\t/go/tinysql/store/tikv/region_request.go:70\ngithub.com/pingcap/tidb/store/tikv.(*tikvStore.SendReq\n\t/go/tinysql/store/tikv/kv.go:312\ngithub.com/pingcap/tidb/store/tikv.(*LockResolver.resolveLock\n\t/go/tinysql/store/tikv/lock_resolver.go:352\ngithub.com/pingcap/tidb/store/tikv.(*LockResolver.ResolveLocks\n\t/go/tinysql/store/tikv/lock_resolver.go:194\ngithub.com/pingcap/tidb/store/tikv.actionPrewrite.handleSingleBatch\n\t/go/tinysql/store/tikv/2pc.go:404\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.doActionOnBatches\n\t/go/tinysql/store/tikv/2pc.go:313\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.doActionOnKeys\n\t/go/tinysql/store/tikv/2pc.go:301\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.prewriteKeys\n\t/go/tinysql/store/tikv/2pc.go:533\ngithub.com/pingcap/tidb/store/tikv.actionPrewrite.handleSingleBatch\n\t/go/tinysql/store/tikv/2pc.go:380\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.doActionOnBatches\n\t/go/tinysql/store/tikv/2pc.go:313\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.doActionOnKeys\n\t/go/tinysql/store/tikv/2pc.go:301\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.prewriteKeys\n\t/go/tinysql/store/tikv/2pc.go:533\ngithub.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter.execute\n\t/go/tinysql/store/tikv/2pc.go:572\ngithub.com/pingcap/tidb/store/tikv.(*tikvTxn.Commit\n\t/go/tinysql/store/tikv/txn.go:188\ngithub.com/pingcap/tidb/kv.RunInNewTxn\n\t/go/tinysql/kv/txn.go:61\ngithub.com/pingcap/tidb/ddl.(*ddl.genGlobalIDs\n\t/go/tinysql/ddl/ddl.go:370\ngithub.com/pingcap/tidb/ddl.(*ddl.CreateSchema\n\t/go/tinysql/ddl/ddl_api.go:53\ngithub.com/pingcap/tidb/executor.(*DDLExec.executeCreateDatabase\n\t/go/tinysql/executor/ddl.go:124\ngithub.com/pingcap/tidb/executor.(*DDLExec.Next\n\t/go/tinysql/executor/ddl.go:79\ngithub.com/pingcap/tidb/executor.Next\n\t/go/tinysql/executor/executor.go:161\ngithub.com/pingcap/tidb/executor.(*ExecStmt.handleNoDelayExecutor\n\t/go/tinysql/executor/adapter.go:227\ngithub.com/pingcap/tidb/executor.(*ExecStmt.handleNoDelay\n\t/go/tinysql/executor/adapter.go:214\ngithub.com/pingcap/tidb/executor.(*ExecStmt.Exec\n\t/go/tinysql/executor/adapter.go:190\ngithub.com/pingcap/tidb/session.runStmt\n\t/go/tinysql/session/tidb.go:219\ngithub.com/pingcap/tidb/session.(*session.executeStatement\n\t/g
o/tinysql/session/session.go:536\ngithub.com/pingcap/tidb/session.(*session.execute\n\t/go/tinysql/session/session.go:615"]

[2024/04/05 20:17:39.241 +00:00] [INFO] [region_cache.go:308] ["invalidate current region, because others failed on same store"] [region=2] [store=127.0.0.1:20160]

[2024/04/05 20:17:41.532 +00:00] [INFO] [domain.go:126] ["full load InfoSchema success"] [usedSchemaVersion=0] [neededSchemaVersion=0] ["start time"=2.059864ms]

0 Comments
2024/04/05
21:03 UTC

1

Model Parallelism using Pytorch and sockets

Hey! Can somebody share a sample implementation of model parallelism using PyTorch and sockets? I have a project presentation coming up and it's a tight schedule :/
Thanks in advance.
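
For reference, a minimal sketch of two-stage model parallelism over a raw TCP socket, forward pass only (gradients would need the reverse channel). The layer sizes, host, and port are placeholders; tensors travel as length-prefixed torch.save payloads.

    # Run "python mp.py server" (second half), then "python mp.py client".
    import io, socket, struct, sys
    import torch
    import torch.nn as nn

    HOST, PORT = "127.0.0.1", 50007

    def recv_exact(sock, n):
        data = b""
        while len(data) < n:
            chunk = sock.recv(n - len(data))
            if not chunk:
                raise ConnectionError("socket closed")
            data += chunk
        return data

    def send_tensor(sock, t):
        buf = io.BytesIO()
        torch.save(t, buf)                                 # serialize tensor
        data = buf.getvalue()
        sock.sendall(struct.pack("!I", len(data)) + data)  # length prefix

    def recv_tensor(sock):
        (n,) = struct.unpack("!I", recv_exact(sock, 4))
        return torch.load(io.BytesIO(recv_exact(sock, n)))

    if sys.argv[1] == "server":                  # owns the second half
        stage2 = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
        conn, _ = socket.create_server((HOST, PORT)).accept()
        send_tensor(conn, stage2(recv_tensor(conn)))   # activations in, logits out
    else:                                        # owns the first half
        stage1 = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
        sock = socket.create_connection((HOST, PORT))
        send_tensor(sock, stage1(torch.randn(4, 128)))
        print(recv_tensor(sock).shape)           # torch.Size([4, 10])

For a real project, torch.distributed's send/recv or RPC primitives do the same thing more robustly; the socket version just makes the mechanics visible.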

0 Comments
2024/04/05
04:23 UTC

1

Framework to distribute the running of LLMs on separate edge devices.

Hey Fellas!

My course project involves building a framework that uses each of our phones to distribute the running of an LLM. The motive is to eliminate the dependency on a central server (like how all APIs function). How can I achieve this? Using sockets, Open MPI, etc.?

Can you help me with the project architecture too, please? (P2P or master-slave? Algorithms like Chord?)

I'm new to this, and any suggestions would be appreciated.
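
For reference, one hedged sketch of the P2P direction: treat the phones as a Chord-style hash ring and give each peer a contiguous slice of the model's layers, so a request hops peer to peer in ring order with no central server. Only the assignment logic is shown; transport (e.g. sockets, as in the model-parallelism post above) and fault handling are left out, and all names are made up.

    import hashlib

    def ring_position(node_id: str) -> int:
        # Place a node on a 2**32 ring by hashing its ID (Chord-style).
        return int(hashlib.sha256(node_id.encode()).hexdigest(), 16) % 2**32

    def assign_layers(nodes: list, n_layers: int) -> dict:
        # Sort peers around the ring and hand each a contiguous block of
        # layers; activations are forwarded along the ring at inference.
        ordered = sorted(nodes, key=ring_position)
        per, extra = divmod(n_layers, len(ordered))
        plan, start = {}, 0
        for i, node in enumerate(ordered):
            count = per + (1 if i < extra else 0)
            plan[node] = range(start, start + count)
            start += count
        return plan

    print(assign_layers(["alice-phone", "bob-phone", "carol-phone"], 32))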

2 Comments
2024/03/31
09:46 UTC

2

New BOINC 8.0.0 is ready for testing

0 Comments
2024/03/29
00:16 UTC

3

Anyone cognizant of the distributed computing DreamLab app/platform?

It's a great way to contribute to projects focused on cancer research, long COVID, & cyclone weather modeling. The app is available for iOS & Android. You can also run it on Windows 11 & Linux distros with emulators. There's also an unofficial subreddit, which has been running for a couple of years now.

0 Comments
2024/03/28
19:19 UTC

1

PowerShell RM vs REST-API

So in my company we are changing our main workflow from one machine to a DisCo (distributed computing) setup: 5 machines, one UI and 4 backend nodes, all running Windows. The workflow, simplified: the UI needs to launch a Java program on the nodes, and when the calculation is complete the UI needs to know. Before, with only one backend machine, we ran the workflow using PsExec; now that we are enhancing the system, we want to move away from it and implement the best approach. The problem is that most of the seniors are Windows-oriented and want to use PowerShell Remoting instead of a REST API. Am I too biased in preferring the latter? I don't want to be tied to the Windows OS, but from a security and reliability point of view, is PowerShell actually the better approach of the two?
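
For reference, to make the REST option concrete: each node could expose something like the sketch below, where the UI POSTs to launch the Java program and polls for completion (or you add a callback). The endpoints, port, and worker.jar are hypothetical; you'd put HTTPS and authentication in front of it, which is where it stays OS-agnostic, while PowerShell Remoting ties you to WinRM.

    import json
    import subprocess
    import uuid
    from http.server import BaseHTTPRequestHandler, HTTPServer

    JOBS = {}  # job id -> Popen handle

    class JobHandler(BaseHTTPRequestHandler):
        def do_POST(self):             # POST /jobs: launch the Java program
            job_id = uuid.uuid4().hex
            JOBS[job_id] = subprocess.Popen(["java", "-jar", "worker.jar"])
            self._reply(202, {"job": job_id})

        def do_GET(self):              # GET /jobs/<id>: report completion
            job = JOBS.get(self.path.rsplit("/", 1)[-1])
            if job is None:
                self._reply(404, {"error": "unknown job"})
            elif job.poll() is None:
                self._reply(200, {"status": "running"})
            else:
                self._reply(200, {"status": "done", "exit_code": job.returncode})

        def _reply(self, code, payload):
            body = json.dumps(payload).encode()
            self.send_response(code)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("0.0.0.0", 8080), JobHandler).serve_forever()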

Thanks in advance for your insights!

0 Comments
2024/03/25
18:30 UTC

4

References to design fintech system?

Hi, folks, I have been tasked with creating a fintech system. This means the system needs to be:

  1. Highly auditable
  2. Correct
  3. Verifiable
  4. Secure

Any references would help: books, courses, blogs, etc.

8 Comments
2024/03/24
18:58 UTC

4

Dask Demo Day: Dask on Databricks, scale embedding pipelines, and Prefect on the cloud

I wanted to share the talks from last month’s Dask Demo Day, where folks from the Dask community give short demos to show off ongoing work. Hopefully this helps elevate some of the great work people are doing.

Last month’s talks:

  • One trillion row challenge
  • Deploy Dask on Databricks with dask-databricks
  • Deploy Prefect workflows on the cloud with Coiled
  • Scale embedding pipelines (LlamaIndex + Dask)
  • Use AWS Cost Explorer to see the cost of public IPv4 addresses

Recording on YouTube: https://www.youtube.com/watch?v=07e1JL83ur8

Join the next one this Thursday, March 21st, 11am ET https://github.com/dask/community/issues/307

1 Comment
2024/03/19
15:28 UTC

3

What do you think about Oxia: a new high-performance metadata store?

Oxia is a new metadata store and coordination system similar to ZooKeeper or etcd. In comparison with those, however, it can store hundreds of GBs of data and handle millions of reads and writes per second.

Oxia GitHub repository: https://github.com/streamnative/oxia

It's developed by StreamNative, the company that works mostly on Apache Pulsar.

Disclaimer: I'm NOT a StreamNative employee, but recently became an Apache Pulsar committer.

Oxia is licensed under Apache License 2.0. From public discussions, I know that StreamNative has plans to donate it to the Apache Software Foundation, but they haven't shared a concrete date yet.

The Java client has just been open-sourced. Here is a blog post: https://streamnative.io/blog/the-oxia-java-client-library-is-now-open-source

Oxia is already integrated with Pulsar but isn't a default option yet. Folks from StreamNative claim that they used it in the cloud for a few months without any problems.

I haven't seen any independent benchmarks yet, so I can't validate the performance and stability claims. Perhaps this post will change that and attract engineers who are interested in trying Oxia for their projects and sharing the results publicly.

I think it may be interesting for engineers who work on building distributed systems.
I would greatly appreciate hearing about your thoughts on the project!

2 Comments
2024/03/06
15:47 UTC

4

HPC

  1. Why do most of the HPC job prospects here come from the software development side? Is HPC mostly used by software developers in companies? What about ML + HPC, or applications other than software development?

  2. Another question: are HPC experts paid poorly? Many here keep saying "don't expect too much in this field", "companies don't really need HPC experts", etc. If so, which side of HPC gets paid more (architecture, security, ops, software development, networking, computing)?

0 Comments
2024/03/03
04:34 UTC

4

Elevate Your Computing Experience: Deploy Your Virtual Machine on CUDOS Intercloud Today!

Experience the future of cloud computing with CUDOS Intercloud! By deploying your own virtual machine, you unlock unparalleled benefits in scalability, cost efficiency, and customization. With the ability to tailor storage options and access diverse payment methods, including OSMO and USDCaxl on Osmosis, the platform offers unmatched flexibility. What's more, CUDOS Intercloud ensures your peace of mind with an exclusive USD guarantee for CUDOS token users, minimizing slippage losses. Join the revolution in decentralized computing and enjoy seamless performance with CUDOS Intercloud. Don't miss out on the opportunity to elevate your computing experience to new heights – start deploying your virtual machine today! intercloud.cudos.org 🚀

0 Comments
2024/02/19
15:59 UTC

2

Distributed Computing Competition for HS Students

0 Comments
2024/02/18
21:44 UTC

10

How to get into distributed computing?

I mean, where do I get a distributed system to play with? And why should I aim for a distributed system in the first place?

I am fairly interested in trying some HPC-adjacent things on a distributed setup, but I'm not sure how to go about it.

12 Comments
2024/02/16
09:49 UTC

2

Building invincible, durable microservices in cloud computing - Golem's platform becomes 100% open source.

Golem goes open source, revolutionizing durable execution and ensuring invincibility against code updates and failures. Contribute to the next-gen infrastructure and learn more about Golem: https://www.golem.cloud/post/golem-goes-open-source.

0 Comments
2024/02/05
11:05 UTC

1

BOINC 7.24.3 released for Mac

0 Comments
2024/01/30
21:02 UTC

3

How does CUDOS empower Cloud Computing Technology?

CUDOS empowers cloud computing through decentralized, cost-efficient, and secure computing. It democratizes access by incentivizing individuals and organizations to contribute spare computing resources. This decentralized model enhances scalability, flexibility, and global accessibility, reducing costs and reliance on centralized servers. By incorporating blockchain, CUDOS ensures security and transparency, mitigating the risk of data breaches. The incentive structure encourages a diverse network of resource providers, fostering a collaborative ecosystem for innovation. CUDOS revolutionizes cloud computing by addressing inefficiencies and offering a resilient, distributed, and economical alternative to traditional cloud services. Embrace the future with r/Cudos_Official! https://cudos.org

0 Comments
2024/01/30
16:37 UTC

3

Dask Demo Day: Apache Beam on Dask, expressions for Dask Array, and 1BRC for Dask vs Spark

Today's talks:

  • Apache Beam DaskRunner
  • Array expressions
  • One billion row challenge in Dask vs. Spark

Recording available on YouTube: https://www.youtube.com/watch?v=wkQzVNQdgW0

Each month folks from the Dask community give short demos that show off ongoing and/or lesser-known work. Hopefully this helps elevate some of the great work people do.

If you're interested in presenting, comment on this github issue with a brief (a couple sentences) description: https://github.com/dask/community/issues/307

0 Comments
2024/01/19
00:06 UTC

1

API Orchestration Solutions

Hi,

I am looking for an API Orchestrator solution.

Requirements:

  1. Given a list of API endpoints represented in a configuration of sequential and parallel execution, I want the orchestrator to call the APIs in the serial/parallel order described in the configuration. The first API in the list will accept the input for the sequence, and the last API will produce the output. (A minimal sketch of these semantics follows the list.)
  2. I am looking for an open-source, library-based solution. I am not interested in a fully hosted solution. Happy to consider Azure solutions since I use Azure.
  3. I want to provide my customers with a domain-specific language (DSL) that they can use to define their orchestration configuration. The system will accept the configuration, create the orchestration, and expose the API.
  4. I want to provide a way in the DSL for customers to specify the mapping between the input/output data types to chain the APIs in the configuration.
  5. I want the call to the API orchestration to be synchronous (not an asynchronous/polling model). Given a request, I want the API orchestrator to execute the APIs as specified in the configuration and return the response synchronously, in a few milliseconds to less than a couple of seconds. The APIs being orchestrated will ensure they return responses on the order of milliseconds.
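
For reference, the seq/par semantics in requirement 1 are small enough to sketch directly; here is a minimal asyncio interpreter for a hypothetical nested configuration format (illustration only, not a library recommendation; call_api is a placeholder for a real HTTP client):

    import asyncio

    # Hypothetical config: {"seq": [...]} runs children in order, piping
    # each child's output into the next; {"par": [...]} runs children
    # concurrently on the same input and gathers their outputs.
    CONFIG = {"seq": ["https://a/api",
                      {"par": ["https://b/api", "https://c/api"]},
                      "https://d/api"]}

    async def call_api(url: str, payload):
        # Placeholder for a real HTTP POST (aiohttp, httpx, ...).
        await asyncio.sleep(0.01)
        return {"from": url, "in": payload}

    async def run(node, payload):
        if isinstance(node, str):      # leaf: a single API call
            return await call_api(node, payload)
        if "seq" in node:              # serial: chain outputs into inputs
            for child in node["seq"]:
                payload = await run(child, payload)
            return payload
        if "par" in node:              # parallel: fan out, gather results
            return await asyncio.gather(*(run(c, payload) for c in node["par"]))
        raise ValueError(f"bad node: {node!r}")

    print(asyncio.run(run(CONFIG, {"input": 1})))

The input/output mapping in requirement 4 would slot into the seq chaining step, and a YAML or JSON DSL maps one-to-one onto this nested structure.
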
1 Comment
2024/01/17
15:54 UTC

3

Distributed system engineer

I'm a mobile developer with 2.3 years of experience. Now I want to become a distributed systems engineer, but I don't know where to start or what to learn. I know Java; should I first learn to build REST APIs (with Spring Boot, Hibernate)? Should I look for a new job in a backend developer role? Can anyone please guide me...

2 Comments
2024/01/17
09:38 UTC

1

Lamport distributed snapshot

Hello! I'm building a personal (for now) repository with some simulations of distributed algorithms. I noticed there are few to zero public implementations and wanted to contribute in my own way.

I already wrote some (using threading + xmlrpc) for mutual exclusion, Lamport ME, leader election, etc., but I'm stuck on a sample global snapshot algorithm using Chandy-Lamport.

Would any of you know of some sources? I mostly find papers. Thank you all!

edit: if you already know of some implementations (even distant ones, I'll adapt them), please let me know!
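
For reference, since implementations are scarce: a compact single-process simulation of the Chandy-Lamport marker rule, assuming FIFO channels (which the algorithm requires). A sketch to adapt, not a full library.

    from collections import defaultdict, deque

    MARKER = "MARKER"  # sketch simplification: a reserved message value

    class Process:
        def __init__(self, pid, state):
            self.pid, self.state = pid, state
            self.recorded_state = None
            self.recording = {}    # sender -> in-flight messages captured
            self.done = set()      # senders whose marker has arrived

        def start_snapshot(self, channels, peers):
            # Initiator rule: record own state, then send a marker on every
            # outgoing channel before any further application messages.
            self.recorded_state = self.state
            for q in peers:
                channels[(self.pid, q)].append(MARKER)
                self.recording[q] = []

        def receive(self, sender, msg, channels, peers):
            if msg == MARKER:
                if self.recorded_state is None:
                    # First marker seen: record state, relay markers, and
                    # record the channel from `sender` as empty.
                    self.recorded_state = self.state
                    for q in peers:
                        channels[(self.pid, q)].append(MARKER)
                        if q != sender:
                            self.recording[q] = []
                self.recording.setdefault(sender, [])
                self.done.add(sender)  # stop recording this channel
            elif self.recorded_state is not None and sender not in self.done:
                # Arrived between our recording point and the sender's
                # marker: it was in flight, so it belongs to the snapshot.
                self.recording[sender].append(msg)

    # Tiny driver: p0 initiates while a message from p1 is still in flight.
    channels = defaultdict(deque)
    p0, p1 = Process(0, "A"), Process(1, "B")
    channels[(1, 0)].append("m1")                    # in-flight p1 -> p0
    p0.start_snapshot(channels, peers=[1])
    p1.receive(0, channels[(0, 1)].popleft(), channels, peers=[0])
    p0.receive(1, channels[(1, 0)].popleft(), channels, peers=[1])  # m1
    p0.receive(1, channels[(1, 0)].popleft(), channels, peers=[1])  # marker
    print(p0.recorded_state, p0.recording)           # A {1: ['m1']}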

1 Comment
2024/01/16
23:58 UTC

8

Could A Database Lock Be Used To Elect A Transaction Coordinator?

I’m reading about consensus algorithms in “Designing Data Intensive Applications” and I had a sort of naive thought, so I want to know why it is wrong.

The author discusses the two-phase commit protocol and its problems as a motivation for distributed consensus. What I got is basically that the leader may fail and leave all prepared nodes in permanent limbo, and choosing a new leader would require consensus among the nodes.

So I have a rather naive solution. Why not just have a database somewhere that encodes the commit log? Then the leader would be whoever acquires either a lock or a valid token to append to that table, and the token would be something that you’d have to renew after a certain period of time. Whatever node controls the database would be delegated the task of deciding which requesting node actually gets the token.

So I imagine if this were so simple, that’s what people would do and this idea must be horribly stupid and naive — but I’m curious if someone is patient enough to explain why this wouldn’t work.
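
For reference, the scheme described is essentially a lease, and lease-based coordinator election is used in practice. A minimal sketch with SQLite standing in for the shared database (table and names hypothetical):

    import sqlite3
    import time
    import uuid

    LEASE_SECONDS = 10

    def setup(db):
        db.execute("CREATE TABLE IF NOT EXISTS lease"
                   "(id INTEGER PRIMARY KEY CHECK (id = 1),"
                   " holder TEXT, expires REAL)")
        db.execute("INSERT OR IGNORE INTO lease VALUES (1, NULL, 0)")
        db.commit()

    def try_acquire(db, me: str) -> bool:
        # Atomically take the lease iff it is free, expired, or already
        # ours (renewal). The UPDATE's WHERE clause is the compare-and-swap.
        now = time.time()
        cur = db.execute(
            "UPDATE lease SET holder = ?, expires = ? "
            "WHERE id = 1 AND (expires < ? OR holder = ?)",
            (me, now + LEASE_SECONDS, now, me))
        db.commit()
        return cur.rowcount == 1

    db = sqlite3.connect("coord.db")
    setup(db)
    me = uuid.uuid4().hex
    if try_acquire(db, me):
        print("coordinator until the lease lapses")

The usual catches: the database granting the lease now becomes the thing that must not fail (the consensus problem moves there rather than disappearing), and an expired-but-slow holder can still act unless writes are fenced, e.g. with a token that increments on each grant.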

7 Comments
2024/01/01
00:16 UTC
