# New Developer Guide - Data Cloud Custom Code Python SDK
Welcome to the Salesforce Data Cloud Custom Code Python SDK! This guide will help you get started with development and contribution to this repository.
## 🚀 Quick Start
### Prerequisites
See the [Prerequisites section in README.md](./README.md#prerequisites) for complete setup requirements.
### Initial Setup
1. **Clone the repository**

   ```bash
   git clone <repository-url>
   cd datacloud-customcode-python-sdk
   ```
2. **Set up virtual environment and install Poetry**

   ```bash
   python3 -m venv .venv
   source .venv/bin/activate
   pip install poetry
   poetry build
   ```
3. **Install dependencies**

   ```bash
   # Install main dependencies
   poetry install --only main

   # Install development dependencies
   poetry install --with dev
   ```
4. **Verify installation**

   ```bash
   poetry run datacustomcode version
   ```
## 🔧 Makefile Commands
The project includes a comprehensive Makefile for common development tasks:
```bash
# Clean build artifacts, caches and temporary files
make clean

# Build package distribution
make package

# Install main dependencies only
make install

# Install dependencies for full development setup
make develop

# Run code quality checks
make lint

# Perform static type checking
make mypy

# Run complete test suite
make test
```
---
**Welcome to the community!** If you have any questions or need help getting started, don't hesitate to create an issue in the repository or reach out to the maintainers through the project's communication channels.
---

**README.md** (46 additions, 12 deletions): updated sections below
## Prerequisites

- **Python 3.11 only** (currently the supported version; if your system version is different, we recommend using [pyenv](https://github.com/pyenv/pyenv) to configure 3.11)
- Docker support like [Docker Desktop](https://docs.docker.com/desktop/)
- A Salesforce org with some DLOs or DMOs with data and this feature enabled (it is not GA)
> The example entrypoint.py requires an `Account_Home__dll` DLO to be present. In order to deploy the script (next step), the output DLO (`Account_Home_copy__dll` in the example entrypoint.py) also needs to exist and be in the same dataspace as `Account_Home__dll`.

After modifying the `entrypoint.py` as needed, using any dependencies you add in the `.venv` virtual environment, you can run this script in Data Cloud.

**Adding Dependencies**: To add new dependencies:

1. Make sure your virtual environment is activated
2. Add dependencies to `requirements.txt`
3. Run `pip install -r requirements.txt`
4. The SDK automatically packages all dependencies when you run `datacustomcode zip`

You can now use the Salesforce Data Cloud UI to find the created Data Transform and use the `Run Now` button to run it.

Once the Data Transform run is successful, check the DLO your script is writing to and verify the correct records were added.
## Dependency Management

The SDK automatically handles all dependency packaging for Data Cloud deployment. Here's how it works:

1. **Add dependencies to `requirements.txt`** - List any Python packages your script needs
2. **Install locally** - Use `pip install -r requirements.txt` in your virtual environment
3. **Automatic packaging** - When you run `datacustomcode zip`, the SDK automatically:
   - Packages all dependencies from `requirements.txt`
   - Uses the correct platform and architecture for Data Cloud

**No need to worry about platform compatibility** - the SDK handles this automatically through the Docker-based packaging process.
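
As a minimal sketch of this workflow, suppose you add `numpy` to `requirements.txt` and `pip install` it locally (the package and the transform below are hypothetical illustrations, not part of the SDK): your script can then import it like any locally installed package, and `datacustomcode zip` bundles it for Data Cloud.

```python
# Hypothetical fragment of payload/entrypoint.py.
# Assumes "numpy" was added to requirements.txt and installed into
# the .venv; `datacustomcode zip` packages it for deployment.
import numpy as np


def average_revenue(amounts: list[float]) -> float:
    """Use the third-party dependency exactly as you would locally."""
    return float(np.mean(amounts))


if __name__ == "__main__":
    print(average_revenue([100.0, 250.0, 400.0]))  # 250.0
```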
## API
Your entry point script will define logic using the `Client` object which wraps data access layers.
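
For orientation, here is a hedged sketch of an entry point modeled on the `Account_Home__dll` copy example mentioned above. The import path and the `read_dlo`/`write_to_dlo` method names are assumptions for illustration; confirm them against the SDK's API documentation.

```python
# Sketch of an entrypoint.py, assuming the copy example described
# earlier. The import path and method names (read_dlo, write_to_dlo)
# are assumptions -- verify against the SDK's API documentation.
from datacustomcode.client import Client


def main() -> None:
    client = Client()

    # Read the input DLO (must exist in your org).
    df = client.read_dlo("Account_Home__dll")

    # ...apply your transformations to df here...

    # Write to the output DLO, which must already exist in the same
    # dataspace as the input DLO.
    client.write_to_dlo("Account_Home_copy__dll", df)


if __name__ == "__main__":
    main()
```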
## Docker usage

The SDK provides Docker-based development options that allow you to test your code in an environment that closely resembles Data Cloud's execution environment.

### How Docker Works with the SDK

When you initialize a project with `datacustomcode init my_package`, a `Dockerfile` is created automatically. This Dockerfile:

- **Isn't used during local development** with virtual environments
- **Becomes active during packaging** when you run `datacustomcode zip` or `deploy`
- **Ensures compatibility** by using the same base image as Data Cloud
- **Handles dependencies automatically** regardless of platform differences
### VS Code Dev Containers
Within your `init`ed package, you will find a `.devcontainer` folder which allows you to run a docker container while developing inside of it.
Read more about Dev Containers here: https://code.visualstudio.com/docs/devcontainers/containers.
#### Setup Instructions

1. Install the VS Code extension "Dev Containers" by Microsoft.
2. Open your package folder in VS Code, ensuring that the `.devcontainer` folder is at the root of the File Explorer.
3. Bring up the Command Palette (on Mac: Cmd + Shift + P), and select "Dev Containers: Rebuild and Reopen in Container".
4. Allow the docker image to be built, then you're ready to develop.

#### Development Workflow

Once inside the Dev Container:

- **Terminal access**: Open a terminal within the container
- **Run your code**: Execute `datacustomcode run ./payload/entrypoint.py`
- **Environment consistency**: Your code will run inside a docker container that more closely resembles Data Cloud compute than your machine

> [!TIP]
> **IDE Configuration**: Use `CMD+Shift+P` (or `Ctrl+Shift+P` on Windows/Linux), then select "Python: Select Interpreter" to configure the correct Python interpreter.
> [!IMPORTANT]
> Dev Containers get their own tmp file storage, so you'll need to re-run `datacustomcode configure` every time you "Rebuild and Reopen in Container".