# Quick Start

For complete, runnable examples see the [lucidlink-python-examples](https://github.com/LucidLink/lucidlink-python-examples) repository.

## Installation

### Prerequisites

- Python 3.10 or later
- [Service account token](https://support.lucidlink.com/hc/en-us/articles/40222074543757-Getting-Started-with-Service-Accounts-API-Authentication)

### Install

```bash
pip install lucidlink
```

## Basic Usage

```python
import lucidlink

# Create and start daemon
daemon = lucidlink.create_daemon()
daemon.start()

# Authenticate with service account
credentials = lucidlink.ServiceAccountCredentials(
    token="sa_live:your_token_here"
)
workspace = daemon.authenticate(credentials)

# Link to a filespace
filespace = workspace.link_filespace(name="production-data")

# Read directory
entries = filespace.fs.read_dir("/")
for entry in entries:
    print(f"{entry.name}: {entry.size} bytes")

# Write a file
with filespace.fs.open("/example.txt", "wb") as f:
    f.write(b"Hello from LucidLink!")

# Read a file
with filespace.fs.open("/example.txt", "rb") as f:
    content = f.read()
    print(content)

# Cleanup — unlink() automatically syncs pending changes to the hub
filespace.unlink()
daemon.stop()
```

## File Operations

### Streaming File Access

The library provides full `io.RawIOBase` compatibility for streaming.
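Because handles satisfy the `io.RawIOBase` contract, they compose with the standard library's `io` wrappers (buffering, text decoding, and so on). The sketch below uses `io.BytesIO` as a stand-in for a handle returned by `filespace.fs.open(path, "rb")` so it runs without a linked filespace:

```python
import io

# io.BytesIO stands in for a raw handle from filespace.fs.open(path, "rb");
# any io.RawIOBase-compatible object composes with the same stdlib wrappers.
raw = io.BytesIO(b"alpha\nbeta\n")

# Layer buffering and UTF-8 text decoding on top of the raw byte stream
text = io.TextIOWrapper(io.BufferedReader(raw), encoding="utf-8")
for line in text:
    print(line.strip())  # → alpha, then beta
```

With a real filespace handle the wrapping is identical; only the source of the raw stream changes.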
```python
# Binary streaming (read)
with filespace.fs.open("/large_file.dat", "rb", buffering=8192) as f:
    for chunk in iter(lambda: f.read(4096), b""):
        process(chunk)

# Text streaming with encoding (read)
with filespace.fs.open("/document.txt", "rt", encoding="utf-8") as f:
    for line in f:
        print(line.strip())

# Byte range reads
with filespace.fs.open("/data.bin", "rb") as f:
    f.seek(1000)
    data = f.read(100)  # Read 100 bytes from offset 1000

# Streaming writes
data = b"x" * 1024 * 1024
with filespace.fs.open("/output.dat", "wb") as f:
    for i in range(10):
        f.write(data)
```

### fsspec Integration

Access LucidLink files using the [fsspec](https://filesystem-spec.readthedocs.io/) interface:

```python
from lucidlink.fsspec import LucidLinkFileSystem

fs = LucidLinkFileSystem(token='sa_live:your_token_here', sandboxed=True)

# List directory
entries = fs.ls('lucidlink://workspace/filespace/', detail=True)

# Download / upload files
fs.get('lucidlink://workspace/filespace/file.txt', 'local_file.txt')
fs.put('local_file.txt', 'lucidlink://workspace/filespace/uploaded.txt')

# Move/rename (native operation, much faster than copy+delete)
fs.mv('lucidlink://workspace/filespace/old.txt', 'lucidlink://workspace/filespace/new.txt')

# Directory operations
fs.mkdir('lucidlink://workspace/filespace/new_dir')
fs.rmdir('lucidlink://workspace/filespace/empty_dir')

fs.close()
```

With Pandas:

```python
import pandas as pd

# Read CSV directly from LucidLink
df = pd.read_csv(
    'lucidlink://workspace/filespace/data.csv',
    storage_options={'token': 'sa_live:your_token_here'}
)

# Write Parquet to LucidLink
df.to_parquet(
    'lucidlink://workspace/filespace/output.parquet',
    storage_options={'token': 'sa_live:your_token_here'}
)
```

With Dask:

```python
import dask.dataframe as dd

# Read partitioned dataset
ddf = dd.read_parquet(
    'lucidlink://workspace/filespace/dataset/*.parquet',
    storage_options={'token': 'sa_live:your_token_here'}
)

# Process with distributed computation
result = ddf.groupby('category').agg({'value': 'sum'}).compute()
```

### Filesystem Operations

The `filespace.fs` object provides convenience methods for common filesystem operations.

```python
# One-shot file I/O
filespace.fs.write_file("/hello.txt", b"Hello, world!")
content = filespace.fs.read_file("/hello.txt")
print(content)  # b"Hello, world!"

# Directory operations
filespace.fs.create_dir("/projects/2024")

# List directory (names only)
names = filespace.fs.list_dir("/projects")
print(names)  # ["2024"]

# Read directory (full metadata)
entries = filespace.fs.read_dir("/projects")
for entry in entries:
    print(f"{entry.name}: {entry.size} bytes")

# Delete files and directories
filespace.fs.delete("/hello.txt")
filespace.fs.delete_dir("/projects/2024")
filespace.fs.delete_dir("/projects", recursive=True)

# Move/rename
filespace.fs.move("/old_name.txt", "/new_name.txt")

# Check existence
if filespace.fs.file_exists("/config.json"):
    data = filespace.fs.read_file("/config.json")
if filespace.fs.dir_exists("/backups"):
    entries = filespace.fs.list_dir("/backups")

# File/directory metadata
info = filespace.fs.get_entry("/report.pdf")
print(f"Size: {info.size}, Type: {info.type}")
print(f"Modified: {info.mtime}")
```

### File Locking

Use the `lock_type` parameter on `open()` to coordinate file access across clients. Locks are managed by LucidHub and enforced across all connected clients.
```python
# Shared lock — allows concurrent readers
with filespace.fs.open("/data.csv", "rb", lock_type="shared") as f:
    data = f.read()

# Exclusive lock — single writer, blocks other readers and writers
with filespace.fs.open("/db.sqlite", "r+b", lock_type="exclusive") as f:
    content = f.read()
    f.seek(0)
    f.write(updated_content)
```

## LucidLink Connect

Attach existing S3 objects to a filespace as read-only files at arbitrary paths:

```python
from lucidlink import S3DataStoreConfig

connect = filespace.connect

# Register an S3 data store
connect.add_data_store("my-store", S3DataStoreConfig(
    access_key="AKIA...",
    secret_key="...",
    bucket_name="my-bucket",
    region="us-east-1",
))

# Link S3 objects as files in the filespace
connect.link_file(
    file_path="/proj1/dataset1/file1.csv",
    data_store_name="my-store",
    object_id="dataset1_file1.csv",
)

# For bulk linking, provide size and checksum to skip S3 HeadObject calls
connect.link_file(
    file_path="/proj1/dataset1/large.bin",
    data_store_name="my-store",
    object_id="dataset1_large.bin",
    size=1048576,
    checksum="abc123",
)

# Read linked files through the filesystem — just like any other file
filespace.sync_all()  # Sync to see newly linked files
with filespace.fs.open("/proj1/dataset1/file1.csv", "rb") as f:
    content = f.read()

# List linked files (paginated)
result = connect.list_external_files("my-store", limit=50)
for path in result.file_paths:
    print(path)

# Unlink when no longer needed
connect.unlink_file("/proj1/dataset1/file1.csv")

# Remove data store
connect.remove_data_store("my-store")
```

## Workspace Operations

### Discovering Filespaces

```python
workspace = daemon.authenticate(credentials)

# List all filespaces the authenticated user can access
filespaces = workspace.list_filespaces()
for fs in filespaces:
    print(f"{fs.name}: {fs.id}")

# Link to a specific filespace
filespace = workspace.link_filespace(name=filespaces[0].name)
```

### Context Managers

Use context managers for automatic lifecycle management:

```python
import lucidlink

with lucidlink.create_daemon() as daemon:
    credentials = lucidlink.ServiceAccountCredentials(
        token="sa_live:your_token_here"
    )
    workspace = daemon.authenticate(credentials)

    # Filespace context manager — auto sync + unlink on exit
    with workspace.link_filespace(name="my-filespace") as filespace:
        filespace.fs.write_file("/output.txt", b"data")
    # filespace.sync_all() + filespace.unlink() called automatically
# daemon.stop() called automatically
```

## Configuration

### Syncing Changes

By default, `filespace.unlink()` automatically calls `sync_all()` before disconnecting (controlled by the `sync_mode` parameter on `link_filespace()`). This ensures all write operations are committed to the hub.

If you need to verify changes are visible to other clients *before* unlinking, call `sync_all()` explicitly:

```python
filespace.fs.write_file("/data.txt", b"important data")
filespace.sync_all()  # Explicitly sync — changes are now visible to other clients
# ... continue working with the filespace
```

To disable automatic syncing on unlink, use `SyncMode.SYNC_NONE`:

```python
from lucidlink import SyncMode

filespace = workspace.link_filespace(name="data", sync_mode=SyncMode.SYNC_NONE)
# ... write operations ...
filespace.sync_all()  # Caller is responsible for syncing
filespace.unlink()    # Will NOT auto-sync
```

### Storage Modes

#### Sandboxed Mode (Default)

Uses a temporary directory that's automatically cleaned up:

```python
daemon = lucidlink.create_daemon()
```

#### Physical Mode

Uses a persistent `.lucid` folder:

```python
# With cleanup on exit
daemon = lucidlink.create_daemon(sandboxed=False)

# Keep files after exit
daemon = lucidlink.create_daemon(
    sandboxed=False,
    persist_files=True
)

# Custom storage location
daemon = lucidlink.create_daemon(
    sandboxed=False,
    persist_files=True,
    root_path="D:/lucid_data"
)
```

### Error Handling

The SDK provides a hierarchy of exceptions for different failure modes:

```python
from lucidlink.exceptions import (
    LucidLinkError,       # Base class for all SDK errors
    DaemonError,          # Daemon start/stop failures
    AuthenticationError,  # Invalid credentials, expired tokens
    FilespaceError,       # Filespace link/unlink, filesystem errors
    ConfigurationError,   # Invalid parameters or configuration
)

try:
    daemon = lucidlink.create_daemon()
    daemon.start()
    credentials = lucidlink.ServiceAccountCredentials(token="sa_live:your_token_here")
    workspace = daemon.authenticate(credentials)
    filespace = workspace.link_filespace(name="my-filespace")
except AuthenticationError as e:
    print(f"Auth failed: {e}")
except FilespaceError as e:
    print(f"Filespace error: {e}")
except LucidLinkError as e:
    print(f"SDK error: {e}")
```
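One pattern that builds on this hierarchy is retrying transient failures while failing fast on errors that a retry cannot fix. The helper below is plain Python, not part of the SDK; passing `AuthenticationError` as a fatal type is shown as a hypothetical usage under that assumption:

```python
import time

def with_retries(fn, retries=3, base_delay=0.5, fatal=()):
    """Call fn(), retrying with exponential backoff; re-raise `fatal` types immediately."""
    for attempt in range(retries):
        try:
            return fn()
        except fatal:
            raise  # e.g. AuthenticationError: bad credentials won't improve on retry
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts, surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage with the SDK (names from the example above):
#   workspace = with_retries(
#       lambda: daemon.authenticate(credentials),
#       fatal=(AuthenticationError,),
#   )
```

Only errors outside the `fatal` tuple are retried, so configuration and credential problems surface immediately while, say, a momentary network hiccup gets a second chance.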