
Commit 4ad70a2

add llm_gateway_generate_text
1 parent b3e84ee commit 4ad70a2

4 files changed

Lines changed: 73 additions & 3 deletions


CHANGELOG.md

Lines changed: 40 additions & 0 deletions
@@ -1,5 +1,45 @@
 # Changelog
 
+## 1.0.1
+
+### Added
+
+- **`llm_gateway_generate_text()` UDF wrapper for AI-powered DataFrame transformations.**
+
+New method on proxy providers to generate AI completions in DataFrame operations via the `llm_gateway_generate` UDF.
+
+```python
+from datacustomcode import Client
+from pyspark.sql.functions import col
+
+client = Client()
+
+# Generate summaries in a DataFrame column
+df = df.withColumn(
+    "summary",
+    client._proxy.llm_gateway_generate_text(
+        "Summarize {company}: revenue={revenue}, CEO={ceo}",
+        {
+            "company": col("company"),
+            "revenue": col("revenue"),
+            "ceo": col("ceo")
+        },
+        llmModelId="sfdc_ai__DefaultGPT4Omni",
+        maxTokens=200
+    )
+)
+```
+
+**Local Development:** Returns a placeholder string (doesn't execute)
+**BYOC Production:** Calls the real `llm_gateway_generate` UDF
+
+**Parameters:**
+- `template` (str): Prompt template with `{placeholder}` syntax
+- `values` (dict or Column): Dict mapping placeholders to Columns, or a pre-built `named_struct`
+- `llmModelId` (str): Model identifier (required, e.g., `"sfdc_ai__DefaultGPT4Omni"`)
+- `maxTokens` (int): Maximum response length (required, e.g., 200)
+
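The `{placeholder}` syntax in `template` mirrors Python's `str.format` convention; as a plain-Python illustration of how placeholders are filled (literal strings stand in here for the Spark Columns the wrapper actually accepts):

```python
# Plain-Python sketch of the {placeholder} template syntax;
# literal values stand in for the Spark Columns used above.
template = "Summarize {company}: revenue={revenue}, CEO={ceo}"
values = {"company": "Acme Corp", "revenue": "12M", "ceo": "J. Doe"}

prompt = template.format(**values)
print(prompt)  # Summarize Acme Corp: revenue=12M, CEO=J. Doe
```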
+
 ## 1.0.0
 
 ### Breaking Changes

README.md

Lines changed: 27 additions & 3 deletions
@@ -155,7 +155,7 @@ You should only need the following methods:
 * `write_to_dmo(name, spark_dataframe, write_mode)` – Write to a Data Model Object by name with a Spark dataframe
 
 For example:
-```
+```python
 from datacustomcode import Client
 
 client = Client()
@@ -166,10 +166,34 @@ sdf = client.read_dlo('my_DLO')
 client.write_to_dlo('output_DLO', sdf)
 ```
 
+### LLM Gateway
 
-> [!WARNING]
-> Currently we only support reading from DMOs and writing to DMOs or reading from DLOs and writing to DLOs, but they cannot mix.
+Generate AI completions in DataFrame transformations using the LLM gateway UDF.
 
+```python
+from datacustomcode import Client
+from pyspark.sql.functions import col
+
+client = Client()
+
+# Use template with placeholders
+df = df.withColumn(
+    "summary",
+    client._proxy.llm_gateway_generate_text(
+        "Summarize {company}: revenue={revenue}, CEO={ceo}",
+        {
+            "company": col("company"),
+            "revenue": col("revenue"),
+            "ceo": col("ceo")
+        },
+        llmModelId="sfdc_ai__DefaultGPT4Omni",
+        maxTokens=200
+    )
+)
+```
+
+> [!WARNING]
+> This method returns a placeholder string in local development and won't execute. It only works when deployed, where it calls the real LLM Gateway service via the `llm_gateway_generate` UDF.
 
 ## CLI
 
src/datacustomcode/proxy/client/LocalProxyClientProvider.py

Lines changed: 3 additions & 0 deletions
@@ -27,3 +27,6 @@ def __init__(self, **kwargs: object) -> None:
 
     def call_llm_gateway(self, llmModelId: str, prompt: str, maxTokens: int) -> str:
         return f"Hello, thanks for using {llmModelId}. So many tokens: {maxTokens}"
+
+    def llm_gateway_generate_text(self, template, values, llmModelId: str, maxTokens: int):
+        return f"Using Generate Text with {llmModelId} and maxTokens: {maxTokens}"
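For quick local checks without a Spark session, the stub above can be exercised directly; a minimal sketch, with the class reduced to the two methods shown in this diff (the real class, including its constructor, lives in `datacustomcode.proxy.client`):

```python
# Illustrative reconstruction of the local stub from the diff above;
# only the two gateway methods are reproduced.
class LocalProxyClientProvider:
    def call_llm_gateway(self, llmModelId: str, prompt: str, maxTokens: int) -> str:
        # Local development placeholder: no gateway call is made
        return f"Hello, thanks for using {llmModelId}. So many tokens: {maxTokens}"

    def llm_gateway_generate_text(self, template, values, llmModelId: str, maxTokens: int):
        # Local development placeholder: template and values are ignored
        return f"Using Generate Text with {llmModelId} and maxTokens: {maxTokens}"

provider = LocalProxyClientProvider()
result = provider.llm_gateway_generate_text(
    "Summarize {company}", {"company": None},
    llmModelId="sfdc_ai__DefaultGPT4Omni", maxTokens=200,
)
print(result)  # Using Generate Text with sfdc_ai__DefaultGPT4Omni and maxTokens: 200
```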

src/datacustomcode/proxy/client/base.py

Lines changed: 3 additions & 0 deletions
@@ -25,3 +25,6 @@ def __init__(self):
 
     @abstractmethod
     def call_llm_gateway(self, llmModelId: str, prompt: str, maxTokens: int) -> str: ...
+
+    @abstractmethod
+    def llm_gateway_generate_text(self, template, values, llmModelId: str, maxTokens: int): ...
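Because both methods are declared `@abstractmethod`, every proxy provider must implement `llm_gateway_generate_text` before it can be instantiated; a minimal sketch of that enforcement (the base class name `ProxyProvider` is illustrative, not the library's actual name):

```python
from abc import ABC, abstractmethod

# Illustrative base class mirroring the abstract methods in the diff above.
class ProxyProvider(ABC):
    @abstractmethod
    def call_llm_gateway(self, llmModelId: str, prompt: str, maxTokens: int) -> str: ...

    @abstractmethod
    def llm_gateway_generate_text(self, template, values, llmModelId: str, maxTokens: int): ...

# A subclass that skips llm_gateway_generate_text cannot be instantiated:
class Incomplete(ProxyProvider):
    def call_llm_gateway(self, llmModelId: str, prompt: str, maxTokens: int) -> str:
        return "stub"

try:
    Incomplete()
    instantiated = True
except TypeError:
    # abc raises TypeError for missing abstract methods
    instantiated = False
print(instantiated)  # False
```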
