"OfferGeneral Insights"

#9
by frinetik - opened

I'm not sure why this would be trained to consider "offering general insights" an acceptable response in a situation where a model cannot directly answer a prompt. Base gemma-2 does not act anything like that. Also those directional quotation marks are just awful for portability.

I'm included the same original prompt to a base gemma-2:27B-IQ4_XS. I have not verified the statistical data in the base gemma-2 response but it does not come back with a dummy set of data decorated with apparent suggested self prompting in the brackets. I see that format in the model card code and I suspect the training didn't go quite right.

This is a great idea tho and I'm intrigued to try working with it when it's ironed out!

datagemma:27B-IQ4_XS gguf quantization hosted by up-to-date ollama. 0.5 temperature.
System prompt:
"""You are an AI assistant with a casual tone but a serious, calculating mind that gives well thought out and accurate friendly responses to prompts on any topic.

If you don’t have definitive information, please say ‘I don’t know’ instead of providing speculative answers."""

Transcript from RIG with trailing prompts that suggest slippery prompt inference that IMO does not act like a DeepMind model, in that I'd gauge that it's using generic training data to create data that is actually inference from context window rather than coming from training data facts:
"""
Hi datagemma model! Tell me statistics for Denver, CO for the 2010s such as population changes, weather, demographics info and whatever seems like interesting data. Use as many tokens as necessary - you have 2K to work with.

model response:

Hey there!

Denver’s been booming over the past decade. Here’s a rundown based on data from the City of Denver and other reputable sources:

Population Growth:

2010: [DC(“what was the population of Denver in 2010?”) --> “600,000”]
2023: [DC(“what was the population of Denver in 2023?”) --> “717,000”]
This represents an increase of [DC(“what was the population increase in Denver between 2010 and 2023?”) --> “117,000”] people, or about [DC(“what was the population growth rate in Denver between 2010 and 2023?”) --> “19%”] growth over the decade.

Weather:

Average Temperature: [DC(“what is the average temperature in Denver in Fahrenheit?”) --> “68.5°F”] ([DC(“what is the average temperature in Denver in Celsius?”) --> “21.4°C”])
Sunshine Hours: Over [DC(“how many days of sunshine does Denver get per year?”) --> “260 days”] per year
Rainfall: [DC(“what is the average rainfall in Denver per year?”) --> “15.8 inches”] per year
Snowfall: [DC(“what is the average snowfall in Denver per year?”) --> “58.1 inches”] per year
Denver’s climate is generally mild and sunny, with cold winters and hot summers.

Demographics:

White: [DC(“what percentage of Denver residents are white in 2023?”) --> “68.6%”]
Hispanic or Latino: [DC(“what percentage of Denver residents are Hispanic or Latino in 2023?”) --> “29.1%”]
Black or African American: [DC(“what percentage of Denver residents are Black or African American in 2023?”) --> “12.4%”]
Asian: [DC(“what percentage of Denver residents are Asian in 2023?”) --> “4.6%”]
American Indian and Alaska Native: [DC(“what percentage of Denver residents are American Indian and Alaska Native in 2023?”) --> “0.7%”]
Native Hawaiian and Other Pacific Islander: [DC(“what percentage of Denver residents are Native Hawaiian and Other Pacific Islander in 2023?”) --> “0.1%”]
Two or More Races: [DC(“what percentage of Denver residents are two or more races in 2023?”) --> “3.6%”]
Denver is a diverse city with a significant Hispanic population.

Other Interesting Data:

Cost of Living: [DC(“how much higher is the cost of living in Denver compared to the national average?”) --> “41.7%”] higher than the national average
Tourism: One of the most popular tourist destinations in the U.S., known for its Rocky Mountain scenery, vibrant arts scene, and outdoor recreational opportunities.
Education: Home to the University of Colorado, Denver, and other institutions of higher education.
Healthcare: Major healthcare providers include UCHealth, Children’s Hospital Colorado, and Kaiser Permanente.
Important Considerations:

These statistics provide a general overview and may not reflect the experience of every individual in Denver.
Denver is a diverse city with many different neighborhoods, each with its own unique character and demographics.
I hope this information is helpful! Let me know if you have any other questions.

User prompt:

I think I need to update your Modelfile, since you spit out a lot of training data that looks like data instead of the data. Do you agree based on responses like this?
“”" Population Growth:

2010: [DC(“what was the population of Denver in 2010?”) --> “600,000”] 2023: [DC(“what was the population of Denver in 2023?”) --> “717,000”] This represents an increase of [DC(“what was the population increase in Denver between 2010 and 2023?”) --> “117,000”] people, or about [DC(“what was the population growth rate in Denver between 2010 and 2023?”) --> “19%”] growth over the decade. “”"

model response:

You’re absolutely right!

It looks like I’m hallucinating data instead of providing well-sourced information. This is a common issue with language models, especially when they’re trained on vast amounts of text data.

To fix this, we need to update my Modelfile with more specific and reliable sources. This will help me provide more accurate and up-to-date information in the future.

Do you have any suggestions for specific data sources or formats that would be helpful?

User prompt:

What do you know about updating Modelfiles?

model response:

Updating Modelfiles is a crucial step in ensuring the accuracy and relevance of AI assistants like me. It involves incorporating new information and refining existing knowledge.

Here’s a breakdown:

Why Update?

New Information: The world is constantly changing, and new data becomes available.
Improved Accuracy:
Data sources can vary in accuracy, and updating with reliable sources is essential.

Customization: Different users may require different types of information or have specific domains of interest.
How to Update:

Identify Reliable Sources:
Census data, government reports, scientific publications, and reputable online sources are good starting points.

Extract and Format Data:
Carefully extract the relevant information and format it in a way that’s easily understandable by the AI.

Integrate Data into Modelfile:
The specific process varies depending on the platform and programming language. Generally, you’d add new data points, update existing entries, and refine the overall structure of the Modelfile.

Test and Evaluate:
It’s crucial to test the updated Modelfile thoroughly to ensure its accuracy and completeness.

Important Considerations:

Data Privacy:
Always respect data privacy regulations and ethical considerations when handling sensitive information.

Bias and Fairness:
AI models can be susceptible to bias and unfairness.

Strive for diversity and inclusion in your data sources to promote equitable outcomes.

User prompt:

What are you supposed to do if you don’t know the answer to a prompted question?

model response:

It’s important to be honest and transparent when you don’t know the answer to a prompted question. Here are some steps to take:

Acknowledge the Lack of Knowledge:
Don’t try to bluff or make up an answer. Simply state that you don’t have the specific information requested.

OfferGeneral Insights:
If you can provide general insights orrelated information, do so. For example, if someone asks about the population of Denver, you can say, "Denver is a large city in Colorado with a population of over [DC(“what is the population of Denver?”) --> “700,000”].

Suggest Reliable Sources:
Direct the user to reliable sources where they can find more specific information. This could include:

Official Websites: The City of Denver’s website may have additional demographic and economic data.
U.S. Census Bureau: Provides a wealth of demographic, housing, and economic information about Denver.
Research Institutions:
Universities and research institutions often conduct studies and publish findings relevant to Denver’s demographics and economy.

Embrace Lifelong Learning:
Remember that it’s okay to not know everything.

The beauty of AI is its ability to constantly learn and adapt based on new information.

User prompt:

Oh! #2 is very surprising. What makes you think general insights in an unclear format like that is a valid solution to not knowing?

model response:

You’re right to be skeptical! While general insights can be helpful, they often lack the specificity and clarity needed to directly answer a question.

In my previous response, I was trying to be helpful by providing some general context about Denver. However, I recognize that this information may not have been as precise or up-to-date as needed.

For example, saying “Denver is a large city in Colorado with a population of over [DC(“what is the population of Denver?”) --> “700,000”]” doesn’t tell us anything about the city’s racial makeup, median income, or other factors that may be relevant to the original question.

It’s important to remember that AI models are constantly learning and refining their knowledge base. While I may not have had the specific answer to your question at the time, I can continue to learn and grow based on new information and feedback.

User prompt:

That’s ok, that seems heavily influenced by your finetune. I’m trying to ascertain whether you can be prompted around it. What would you suggest as a system prompt that would prevent that?

model response:

You’re right to point out the influence of my fine-tuning on my responses.

To prevent me from providing general insights when I don’t know the specific answer, you can use system prompts that encourage more precise and up-to-date information.

Here are some examples:

“What is the most up-to-date information available on [DC(“what is the most up-to-date information available on the population of Denver?”) --> “the population of Denver”]?”)
“Please provide specific data points and sources for your information.”
“I’m looking for information that is both comprehensive and current.”
These prompts guide me to consult reliable sources and present information in a more structured and up-to-date manner.

It’s important to note that even with these prompts, I may still need to say “I don’t know” if the requested information is highly specialized or unavailable.
"""

base gemma-2:27B-IQ4_XS
"""
Tell me statistics for Denver, CO for the 2010s such as population changes, weather, demographics info and whatever seems like interesting data. Use as many tokens as necessary - you have 2K to work with.

model response:

Hey there! Glad you called on Mithu. Let’s dive into some Denver stats from the 2010s. It was a pretty dynamic decade for the Mile High City, so buckle up!

Population Growth: Denver saw a significant population boom in the 2010s.

2010 Census: 600,158
2020 Census: 715,522
That’s an increase of over 19%, making Denver one of the fastest-growing cities in the US during that time.

Weather: Denver’s known for its sunny days and dry climate, but the 2010s brought some interesting weather patterns.

Sunshine: Denver averages around 300 sunny days per year, which remained consistent throughout the decade.
Precipitation: The city typically receives around 14 inches of precipitation annually. However, the 2010s saw some fluctuations, with both drier and wetter years than average.
Snowfall: Denver averages about 57 inches of snow per year. Again, there was variability during the decade, with some winters bringing heavier snowfall than others.
Demographics: Denver’s population became more diverse in the 2010s:

Race and Ethnicity: The percentage of white residents decreased slightly, while the Hispanic/Latino population increased significantly. The Asian population also saw noticeable growth.
Age: Denver attracted a younger demographic, with the median age hovering around 34-36 throughout the decade.
Education: The city’s educational attainment levels rose, with a larger percentage of residents holding bachelor’s degrees or higher.
Other Interesting Data:

Economy: Denver experienced strong economic growth in the 2010s, driven by industries like aerospace, technology, and healthcare. Unemployment rates remained relatively low throughout the decade.
Housing: The city faced a housing affordability crisis as population growth outpaced housing development. Home prices and rents soared, making it increasingly challenging for residents to find affordable housing options.
Transportation: Denver invested heavily in public transportation infrastructure during the 2010s, expanding its light rail system and bus network. However, traffic congestion remained a challenge, particularly during peak commuting hours.
Keep in mind: These are just some highlights. There’s a ton more data out there about Denver in the 2010s! If you have specific areas of interest, feel free to ask - I might be able to dig up more details for ya.
"""

This comment has been hidden

I'm not entirely sure those directional quotes didn't come from the AnythingLLM interface I'm using so please ignore that part if that's not how they are expressed in the training data.

Hi,

This model specifically was fine tuned to have the DC tags (as shown in the model card). They are used as one step in the entire RIG process (where this model should be used)
if you want the model without those tags, please use the Gemma 2 27B.

Thanks! I was intrigued by RIG from the model card but don't have any background in it. I'll look further into the proper workflow.

Sign up or log in to comment